Whether you're building apps, debugging, or writing scripts, these are the models that score highest on coding benchmarks, ranked by real coding evaluation data that is updated daily.
GPT-5.4 (xhigh) leads the field with a coding score of 57.3, demonstrating the strongest code generation accuracy across benchmarks.
Ranked by coding benchmark score, which measures code generation accuracy across multiple programming languages and problem types.
Based on our benchmark rankings, GPT-5.4 (xhigh) is currently the top-ranked model for coding. See the full rankings below for alternatives.

GPT-5.4 (xhigh) costs $2.50/1M input tokens. For 100 requests per day at 2,000 tokens each (input + output), that's approximately $15.00/month.

Currently, there are no free models that rank highly for coding. Check our free models page for the best zero-cost options.

Not sure which model fits your needs? Take our 30-second quiz and get a personalized recommendation.

| # | Model | Provider | Intelligence | Coding | Speed | Price/1M |
|---|---|---|---|---|---|---|
| 1 | GPT-5.4 (xhigh) | OpenAI | 57.2 | 57.3 | 77 tok/s | $2.50 |
| 2 | Gemini 3.1 Pro Preview | Google | 57.2 | 55.5 | 113 tok/s | $2.00 |
| 3 | GPT-5.3 Codex (xhigh) | OpenAI | 54.0 | 53.1 | 72 tok/s | $1.75 |
| 4 | GPT-5.4 mini (xhigh) | OpenAI | 48.1 | 51.5 | 218 tok/s | $0.75 |
| 5 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 51.7 | 50.9 | 71 tok/s | $3.00 |
| 6 | GPT-5.2 (xhigh) | OpenAI | 51.3 | 48.7 | 70 tok/s | $1.75 |
| 7 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 53.0 | 48.1 | 51 tok/s | $5.00 |
| 8 | Claude Opus 4.5 (Reasoning) | Anthropic | 49.7 | 47.8 | 57 tok/s | $5.00 |
| 9 | Gemini 2.5 Pro Preview (Mar '25) | Google | 30.3 | 46.7 | 0 tok/s | Free |
| 10 | Gemini 3 Pro Preview (high) | Google | 48.4 | 46.5 | 115 tok/s | $2.00 |
| 11 | GPT-5.1 (high) | OpenAI | 47.7 | 44.7 | 92 tok/s | $1.25 |
| 12 | GLM-5 (Reasoning) | Z AI | 49.8 | 44.2 | 66 tok/s | $1.00 |
| 13 | GPT-5.4 nano (xhigh) | OpenAI | 44.4 | 43.9 | 216 tok/s | $0.20 |
| 14 | GPT-5.2 Codex (xhigh) | OpenAI | 49.0 | 43.0 | 99 tok/s | $1.75 |
| 15 | Gemini 3 Flash Preview (Reasoning) | Google | 46.4 | 42.6 | 192 tok/s | $0.50 |
| 16 | Grok 4.20 Beta 0309 (Reasoning) | xAI | 48.5 | 42.2 | 246 tok/s | $2.00 |
| 17 | MiniMax-M2.7 | MiniMax | 49.6 | 41.9 | 45 tok/s | $0.30 |
| 18 | MiMo-V2-Pro | Xiaomi | 49.2 | 41.4 | 91 tok/s | $1.00 |
| 19 | Qwen3.5 397B A17B (Reasoning) | Alibaba | 45.0 | 41.3 | 53 tok/s | $0.60 |
| 20 | Grok 4 | xAI | 41.5 | 40.5 | 44 tok/s | $3.00 |
| 21 | Kimi K2.5 (Reasoning) | Kimi | 46.8 | 39.5 | 34 tok/s | $0.60 |
| 22 | GPT-5 Codex (high) | OpenAI | 44.6 | 38.9 | 180 tok/s | $1.25 |
| 23 | Claude 4.5 Sonnet (Reasoning) | Anthropic | 43.0 | 38.6 | 52 tok/s | $3.00 |
| 24 | o3 | OpenAI | 38.4 | 38.4 | 72 tok/s | $2.00 |
| 25 | DeepSeek V3.2 Speciale | DeepSeek | 29.4 | 37.9 | 0 tok/s | Free |
| 26 | MiniMax-M2.5 | MiniMax | 41.9 | 37.4 | 52 tok/s | $0.30 |
| 27 | GLM-5-Turbo | Z AI | 46.8 | 36.8 | 0 tok/s | Free |
| 28 | DeepSeek V3.2 (Reasoning) | DeepSeek | 41.7 | 36.7 | 32 tok/s | $0.28 |
| 29 | GPT-5.1 Codex (high) | OpenAI | 43.1 | 36.6 | 118 tok/s | $1.25 |
| 30 | Claude 4.1 Opus (Reasoning) | Anthropic | 42.0 | 36.5 | 38 tok/s | $15.00 |
| 31 | GPT-5.1 Codex mini (high) | OpenAI | 38.6 | 36.4 | 176 tok/s | $0.25 |
| 32 | GLM-4.7 (Reasoning) | Z AI | 42.1 | 36.3 | 80 tok/s | $0.60 |
| 33 | GPT-5 (high) | OpenAI | 44.6 | 36.0 | 83 tok/s | $1.25 |
| 34 | MiMo-V2-Omni | Xiaomi | 43.4 | 35.5 | 0 tok/s | Free |
| 35 | GPT-5 mini (high) | OpenAI | 41.2 | 35.3 | 76 tok/s | $0.25 |
| 36 | Qwen3.5 27B (Reasoning) | Alibaba | 42.1 | 34.9 | 91 tok/s | $0.30 |
| 37 | Kimi K2 Thinking | Kimi | 40.9 | 34.8 | 89 tok/s | $0.60 |
| 38 | Qwen3.5 122B A10B (Reasoning) | Alibaba | 41.6 | 34.7 | 133 tok/s | $0.40 |
| 39 | Claude 4 Sonnet (Reasoning) | Anthropic | 38.7 | 34.1 | 52 tok/s | $3.00 |
| 40 | o1-preview | OpenAI | 23.7 | 34.0 | 0 tok/s | $17.00 |
| 41 | Claude 4 Opus (Reasoning) | Anthropic | 39.0 | 34.0 | 38 tok/s | $15.00 |
| 42 | DeepSeek V3.1 Terminus (Reasoning) | DeepSeek | 33.9 | 33.7 | 0 tok/s | $0.40 |
| 43 | MiMo-V2-Flash (Feb 2026) | Xiaomi | 41.5 | 33.5 | 128 tok/s | $0.10 |
| 44 | DeepSeek V3.2 Exp (Reasoning) | DeepSeek | 32.9 | 33.3 | 32 tok/s | $0.28 |
| 45 | MiniMax-M2.1 | MiniMax | 39.4 | 32.8 | 51 tok/s | $0.30 |
| 46 | Claude 4.5 Haiku (Reasoning) | Anthropic | 37.1 | 32.6 | 144 tok/s | $1.00 |
| 47 | Gemini 2.5 Pro | Google | 34.6 | 31.9 | 124 tok/s | $1.25 |
| 48 | Step 3.5 Flash | StepFun | 37.8 | 31.6 | 85 tok/s | $0.10 |
| 49 | Doubao Seed Code | ByteDance Seed | 33.5 | 31.3 | 0 tok/s | Free |
| 50 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | NVIDIA | 36.0 | 31.2 | 365 tok/s | $0.30 |
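The monthly cost estimates quoted in this guide follow simple arithmetic: requests per day × tokens per request × days in the month, divided by one million and multiplied by the per-1M price from the table. A minimal sketch in Python (the function name and the 30-day month are our own assumptions, not part of the rankings):

```python
def monthly_cost(price_per_1m: float, requests_per_day: int = 100,
                 tokens_per_request: int = 2_000, days: int = 30) -> float:
    """Estimate monthly spend in dollars at a flat per-1M-token price."""
    tokens_per_month = requests_per_day * tokens_per_request * days
    return price_per_1m * tokens_per_month / 1_000_000

# GPT-5.4 (xhigh) at $2.50/1M, using the 100 req/day x 2,000 token profile:
print(f"${monthly_cost(2.50):.2f}")  # $15.00
```

Swapping in any price from the Price/1M column gives a comparable estimate, e.g. GPT-5.4 mini (xhigh) at $0.75/1M works out to $4.50/month under the same usage profile.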