AIToolRank currently tracks 18 multimodal AI models, 3 of which are free to use. Paid pricing starts at $0.06 per 1M input tokens, and the largest context window in this category is 1.0M tokens.
| # | Model | Provider | Context | Max Output | Input $/1M | Output $/1M | Modalities |
|---|---|---|---|---|---|---|---|
| 1 | Gemini 2.5 Pro | Google | 1.0M | N/A | $1.25 | $10 | Text + Image + File + Audio + Video → Text |
| 2 | Gemini 3.1 Pro Preview | Google | 1.0M | N/A | $2.00 | $12 | Audio + File + Image + Text + Video → Text |
| 3 | GPT-4.1 nano | OpenAI | 1.0M | N/A | $0.10 | $0.40 | Image + Text + File → Text |
| 4 | GPT-4.1 | OpenAI | 1.0M | N/A | $2.00 | $8.00 | Image + Text + File → Text |
| 5 | GPT-4.1 mini | OpenAI | 1.0M | N/A | $0.40 | $1.60 | Image + Text + File → Text |
| 6 | Nova Premier | Amazon | 1.0M | N/A | $2.50 | $13 | Text + Image → Text |
| 7 | Nova Lite | Amazon | 300K | N/A | $0.06 | $0.24 | Text + Image → Text |
| 8 | Nova Pro | Amazon | 300K | N/A | $0.80 | $3.20 | Text + Image → Text |
| 9 | Grok 4 | xAI | 256K | N/A | $3.00 | $15 | Image + Text → Text |
| 10 | Claude 3.5 Haiku | Anthropic | 200K | N/A | $0.80 | $4.00 | Text + Image → Text |
| 11 | Claude 3 Haiku | Anthropic | 200K | N/A | $0.25 | $1.25 | Text + Image → Text |
| 12 | o3 | OpenAI | 200K | N/A | $2.00 | $8.00 | Image + Text + File → Text |
| 13 | Sonar Pro | Perplexity | 200K | N/A | Free | Free | Text + Image → Text |
| 14 | o1-pro | OpenAI | 200K | N/A | $150 | $600 | Text + Image + File → Text |
| 15 | o1 | OpenAI | 200K | N/A | $15 | $60 | Text + Image + File → Text |
| 16 | Sonar Reasoning Pro | Perplexity | 128K | N/A | Free | Free | Text + Image → Text |
| 17 | GPT-4 Turbo | OpenAI | 128K | N/A | $10 | $30 | Text + Image → Text |
| 18 | Sonar | Perplexity | 127K | N/A | Free | Free | Text + Image → Text |
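
The per-1M-token prices in the table can be turned into a rough per-request cost estimate. The sketch below assumes billing is strictly linear in token count, with no minimum charge, caching discount, or tiered pricing, which may not hold for every provider. The function name `request_cost` and the sample token counts are hypothetical; the prices are copied from the table.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices.

    Assumes linear billing: tokens / 1,000,000 * price for each direction.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# (input $/1M, output $/1M) for a few rows of the table.
PRICES = {
    "Nova Lite": (0.06, 0.24),
    "GPT-4.1 nano": (0.10, 0.40),
    "GPT-4.1": (2.00, 8.00),
}

# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
for name, (input_price, output_price) in PRICES.items():
    cost = request_cost(10_000, 1_000, input_price, output_price)
    print(f"{name}: ${cost:.5f} per request")
```

Because output tokens cost several times more than input tokens for every paid model listed, output-heavy workloads can shift the ranking relative to the input price alone.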