Frontier quality isn't cheap, but these budget models handle 80% of tasks for pennies.
You don't always need a $5/million-token model. For many tasks — classification, extraction, summarization, basic coding, and routine Q&A — models priced at $1 or less per million input tokens deliver perfectly adequate results. We ranked every model in this price tier to find the best values.
A year ago, models under $1/1M tokens were barely useful for serious work. Today, the budget tier includes models with intelligence scores above 45 — a level that handles most practical tasks competently.
The improvement comes from two directions: frontier models from 12-18 months ago getting cheaper as newer models replace them, and new efficiency-focused architectures (like MoE) that deliver high capability at low cost.
For startups, hobby projects, and high-volume production workloads, the budget tier is now the smart default. You should only step up to frontier pricing when the task specifically requires it.
Zhipu's GLM-5 leads the budget tier with a 49.8 intelligence score — higher than Claude Opus 4.5 and just 3 points below Claude Sonnet 4.6. At $1 per million input tokens and $3.20 per million output tokens, it costs 60-80% less than the frontier models.
The coding score of 44.2 is strong for this price point. GLM-5 handles software development tasks, analytical reasoning, and creative writing at a quality level that was frontier-only six months ago.
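To make the savings concrete, here is a minimal sketch of per-request cost using GLM-5's published rates from this article ($1 input / $3.20 output per million tokens). The frontier comparison prices and the token counts are illustrative assumptions, not figures from the rankings.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in: float, price_out: float) -> float:
    """Cost in USD for one request; prices are USD per million tokens."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Assumed workload: 2,000 input tokens, 500 output tokens per request.
glm5 = request_cost(2_000, 500, 1.00, 3.20)       # GLM-5 rates from the article
frontier = request_cost(2_000, 500, 5.00, 15.00)  # hypothetical frontier rates

# At 1M requests/month, the gap compounds: $3,600 vs $17,500.
print(f"GLM-5: ${glm5 * 1_000_000:,.0f}/M requests")
print(f"Frontier: ${frontier * 1_000_000:,.0f}/M requests")
```

Under these assumed rates, the budget model runs the same monthly volume at roughly a fifth of the frontier bill, consistent with the 60-80% savings cited above.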
The main limitation is that GLM-5 comes from a Chinese provider, which means API latency from Western locations can be higher and documentation is less comprehensive than that of US providers.
GPT-5.4 Mini's 51.5 coding score makes it the best coding model in the budget tier — and it actually outscores several models that cost 5-10x more. At 218 tok/s, it's also blazingly fast.
For code completion, test generation, documentation, debugging, and routine refactoring, Mini handles the work that GPT-5.4 handles but at 30% of the cost and 3x the speed. The tradeoff is a lower intelligence score (48.1) that shows on complex architectural decisions.
This is the model to use for your IDE coding assistant, automated test generation, and code review at scale.
Google's Gemini 3 Flash delivers 192 tokens per second with a 46.4 intelligence score. That combination of speed and quality at $0.50 per million input tokens makes it the best choice for real-time applications where responsiveness matters.
Chatbots, interactive assistants, real-time search, and live coding suggestions all benefit from Flash's speed. The quality is sufficient for these interactive use cases, where users value a fast, good-enough response over a slow, perfect one.
For the highest-volume, lowest-complexity tasks:
GPT-5.4 Nano ($0.20/$1.25): 44.4 intelligence. Handles classification, extraction, formatting, and simple Q&A. The cheapest major-provider model with meaningful capability.
GPT-5 Nano ($0.05/$0.25): 26.8 intelligence — minimal capability at minimal cost. For basic routing and classification only.
Gemma 3n ($0.02/$0.07): The cheapest benchmarked model. Intelligence of 6.4 means it's only useful for the simplest tasks, but at $0.02/1M tokens, the cost is essentially zero.
For production systems processing millions of requests daily, these micro-models run the simple routing and classification layers while expensive models handle the hard tasks.
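That tiered pattern can be sketched as a simple router: a cheap micro-model classifies each request, and only requests flagged as hard escalate to the stronger (pricier) model. The model names come from this article's rankings, but the `classify` heuristic is a stand-in — in a real system that call would itself go to the micro-model.

```python
# Assumed tier assignments, per the article's rankings:
CHEAP_MODEL = "gpt-5.4-nano"  # $0.20/1M input; classification and routing
STRONG_MODEL = "glm-5"        # $1.00/1M input; hard tasks

def classify(prompt: str) -> str:
    """Toy complexity heuristic (illustrative only). In production,
    this decision would come from the micro-model itself."""
    hard_markers = ("refactor", "architecture", "prove", "multi-step")
    return "hard" if any(m in prompt.lower() for m in hard_markers) else "easy"

def pick_model(prompt: str) -> str:
    """Route hard requests to the strong model, everything else cheap."""
    return STRONG_MODEL if classify(prompt) == "hard" else CHEAP_MODEL

print(pick_model("Refactor this module for testability"))  # glm-5
print(pick_model("What time zone is UTC+8?"))              # gpt-5.4-nano
```

The design choice worth noting: the routing layer must be cheap enough that running it on every request costs less than the escalations it avoids, which is exactly the niche the sub-$0.25 micro-models fill.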
All models priced at $1 or less per million input tokens. Rankings are from the Artificial Analysis Intelligence and Coding indices. Speed measurements are Artificial Analysis median (P50) output speeds.
GLM-5 for best quality under $1. GPT-5.4 Mini for coding on a budget. Gemini 3 Flash for speed-critical applications. GPT-5.4 Nano for rock-bottom pricing. The budget tier now delivers what was frontier-only a year ago.
Published June 5, 2026. Data updated daily from independent benchmarks and API providers.