Compare monthly costs across 24+ LLM APIs from OpenAI, Anthropic, Google, xAI, DeepSeek, Mistral, Meta, and Amazon. Factor in caching discounts and batch pricing.
| Model | Provider | Monthly | Cached | Per request | Annual | Speed |
|---|---|---|---|---|---|---|
| Nova Lite | Amazon | $5.40 | $5.40 | $0.0002 | $64.80 | fast |
| Mistral Small | Mistral | $7.50 | $7.50 | $0.0003 | $90.00 | fast |
| gpt-4.1-nano | OpenAI | $9.00 | $8.32 (−8%) | $0.0003 | $108.00 | fast |
| Gemini 2.0 Flash | Google | $9.00 | $8.32 (−8%) | $0.0003 | $108.00 | fast |
| gpt-4o-mini | OpenAI | $13.50 | $12.82 (−5%) | $0.0004 | $162.00 | fast |
| Gemini 2.5 Flash | Google | $13.50 | $12.49 (−8%) | $0.0004 | $162.00 | fast |
| Grok 3 Mini | xAI | $16.50 | $16.50 | $0.0005 | $198.00 | fast |
| Codestral | Mistral | $22.50 | $22.50 | $0.0008 | $270.00 | fast |
| DeepSeek V3 | DeepSeek | $24.60 | $22.80 (−7%) | $0.0008 | $295.20 | fast |
| Llama 4 Maverick | Meta (Groq) | $26.55 | $26.55 | $0.0009 | $318.60 | fast |
| Llama 3.3 70B | Meta (Groq) | $29.55 | $29.55 | $0.0010 | $354.60 | fast |
| gpt-4.1-mini | OpenAI | $36.00 | $33.30 (−8%) | $0.0012 | $432.00 | fast |
| DeepSeek R1 | DeepSeek | $49.35 | $45.66 (−7%) | $0.0016 | $592.20 | medium |
| Nova Pro | Amazon | $72.00 | $72.00 | $0.0024 | $864.00 | medium |
| Claude Haiku 3.5 | Anthropic | $84.00 | $77.52 (−8%) | $0.0028 | $1.01k | fast |
| o3-mini | OpenAI | $99.00 | $91.58 (−7%) | $0.0033 | $1.19k | medium |
| o4-mini | OpenAI | $99.00 | $91.58 (−7%) | $0.0033 | $1.19k | medium |
| Mistral Large | Mistral | $150.00 | $150.00 | $0.0050 | $1.80k | medium |
| gpt-4.1 | OpenAI | $180.00 | $166.50 (−7%) | $0.0060 | $2.16k | medium |
| Gemini 2.5 Pro | Google | $187.50 | $179.09 (−4%) | $0.0063 | $2.25k | medium |
| gpt-4o | OpenAI | $225.00 | $213.75 (−5%) | $0.0075 | $2.70k | medium |
| Claude Sonnet 4 | Anthropic | $315.00 | $291.37 (−8%) | $0.010 | $3.78k | medium |
| Grok 3 | xAI | $315.00 | $315.00 | $0.010 | $3.78k | medium |
| o3 | OpenAI | $900.00 | $832.50 (−7%) | $0.030 | $10.8k | slow |
| Claude Opus 4 | Anthropic | $1.57k | $1.46k (−8%) | $0.052 | $18.9k | slow |
Prices are per-token API costs from official provider pricing pages (as of May 2025). Actual costs may vary with volume discounts, committed use agreements, or regional pricing.
Cached pricing reflects prompt caching (available on Anthropic, OpenAI, Google, DeepSeek). Batch pricing reflects async/batch API discounts where available.
LLM providers charge per token; a token is roughly 4 characters or ¾ of a word. Costs are split into input tokens (what you send) and output tokens (what the model generates), and the two are usually priced differently. This calculator multiplies your per-request input and output token counts by each model's per-token prices, then scales the result by your daily request volume to get monthly and annual totals.
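The arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name, the 30-day month, and the workload (1,000 input tokens, 500 output tokens, 1,000 requests per day) are assumptions, and so is the $0.24/1M output price. Only the $0.06/1M input price for Nova Lite is taken from this page.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_m, output_price_per_m,
                  requests_per_day, days_per_month=30):
    """Return (per_request, monthly, annual) cost in USD.

    Prices are in USD per 1M tokens. The 30-day month is an assumption.
    """
    # Cost of one request: tokens times price, converted from per-1M pricing.
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    monthly = per_request * requests_per_day * days_per_month
    annual = monthly * 12
    return per_request, monthly, annual


# Hypothetical workload; the output price below is an assumed value.
per_request, monthly, annual = estimate_cost(
    input_tokens=1_000, output_tokens=500,
    input_price_per_m=0.06, output_price_per_m=0.24,
    requests_per_day=1_000,
)
print(f"per request: ${per_request:.5f}")
print(f"monthly: ${monthly:.2f}, annual: ${annual:.2f}")
```

Under these assumptions the sketch gives about $5.40 per month and $64.80 per year, which lines up with the Nova Lite row in the table.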
Prompt caching stores frequently reused prompt prefixes so you don't pay full price on repeat requests. Anthropic, OpenAI, Google, and DeepSeek all offer caching. Savings range from 75% (Anthropic) to 90% (Google) on cached input tokens. Set the cache hit rate slider to see the impact on your costs.
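One way to model the effect of the cache hit rate is to price the cached share of input tokens at a discount and leave output tokens untouched. The sketch below is an assumption about how such a slider could work, not the calculator's implementation: the function name, the 30% hit rate, the 50% discount, and the $2.50/$10.00 per 1M prices are all illustrative values. It also ignores cache-write surcharges and minimum prefix lengths, which some providers apply.

```python
def monthly_cost_with_cache(input_tokens, output_tokens,
                            input_price_per_m, output_price_per_m,
                            requests_per_day, cache_hit_rate, cache_discount,
                            days_per_month=30):
    """Monthly cost in USD when a share of input tokens is served from cache.

    cache_hit_rate: fraction of input tokens billed at the cached rate (0..1).
    cache_discount: fractional discount on cached input tokens (0..1).
    """
    if not (0.0 <= cache_hit_rate <= 1.0 and 0.0 <= cache_discount <= 1.0):
        raise ValueError("cache_hit_rate and cache_discount must be in [0, 1]")

    cached_tokens = input_tokens * cache_hit_rate
    uncached_tokens = input_tokens - cached_tokens

    # Only input tokens are discounted; output tokens are billed in full.
    input_cost = (uncached_tokens * input_price_per_m
                  + cached_tokens * input_price_per_m * (1 - cache_discount))
    output_cost = output_tokens * output_price_per_m

    per_request = (input_cost + output_cost) / 1_000_000
    return per_request * requests_per_day * days_per_month


# Illustrative values only (assumed prices, hit rate, and discount).
baseline = monthly_cost_with_cache(1_000, 500, 2.50, 10.00, 1_000, 0.0, 0.5)
cached = monthly_cost_with_cache(1_000, 500, 2.50, 10.00, 1_000, 0.3, 0.5)
print(f"no cache: ${baseline:.2f}, with cache: ${cached:.2f}")
```

With these inputs the total drops from about $225.00 to about $213.75, roughly 5%. Because output tokens are never discounted, the overall saving is much smaller than the headline discount on cached input tokens.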
Batch APIs let you submit requests in bulk and receive results asynchronously (usually within 24 hours). OpenAI and Anthropic offer 50% discounts on batch requests. This is ideal for data processing, content generation, and evaluation tasks that don't need real-time responses.
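If only part of a workload can tolerate asynchronous results, the blended cost can be estimated by discounting just that share. The following is a rough sketch under stated assumptions: the function name and the `batch_share` parameter are hypothetical, and the 50% default mirrors the discount quoted above rather than any specific provider's current terms.

```python
def blended_monthly_cost(realtime_monthly_cost, batch_share, batch_discount=0.5):
    """Estimate monthly cost when a share of traffic moves to a batch API.

    realtime_monthly_cost: cost in USD if every request were real-time.
    batch_share: fraction of requests that can be batched (0..1).
    batch_discount: fractional discount on batched requests (assumed 50%).
    """
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be in [0, 1]")

    batch_part = realtime_monthly_cost * batch_share * (1 - batch_discount)
    realtime_part = realtime_monthly_cost * (1 - batch_share)
    return batch_part + realtime_part


# Hypothetical example: a $225/month workload with 40% of requests batched.
print(f"${blended_monthly_cost(225.00, 0.4):.2f}")
```

In this example, moving 40% of requests to batch cuts the estimate from $225.00 to about $180.00; batching everything would halve it.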
For high-volume, cost-sensitive workloads, the most affordable options are Amazon Nova Lite ($0.06 per 1M input tokens), Google Gemini 2.0 Flash ($0.10/1M), Mistral Small ($0.10/1M), and OpenAI gpt-4.1-nano ($0.10/1M). DeepSeek V3 ($0.27/1M) offers the best quality-to-cost ratio among budget models.
Prices reflect official API pricing as of May 2025. We update the calculator whenever major providers announce pricing changes. If you notice outdated pricing, let us know.
Use our API Router to get personalized model recommendations based on your task, or compare model capabilities side by side.