RESEARCH
Provider price benchmarks.
Last updated: 24 April 2026 · refreshed quarterly
Reference prices per 1M tokens across the five major frontier providers. Numbers come from public price pages on the date above. Cache-read and batch-API discounts are tracked separately because they're billed separately on real invoices.
How to use: these are list prices. Real engagement pricing factors enterprise discounts, committed-use, and cross-region routing. Treat the numbers below as an upper bound, not a quote.
Frontier-tier reasoning models
| OpenAI · gpt-5 (input) | $10.00 / 1M |
| OpenAI · gpt-5 (output) | $30.00 / 1M |
| Anthropic · claude-opus-4 (input) | $15.00 / 1M |
| Anthropic · claude-opus-4 (output) | $75.00 / 1M |
| Google · gemini-2.5-pro (input) | $1.25 / 1M |
| Google · gemini-2.5-pro (output) | $10.00 / 1M |
Mid-tier workhorses
| OpenAI · gpt-5-mini (input) | $0.25 / 1M |
| OpenAI · gpt-5-mini (output) | $2.00 / 1M |
| Anthropic · claude-sonnet-4 (input) | $3.00 / 1M |
| Anthropic · claude-sonnet-4 (output) | $15.00 / 1M |
| Google · gemini-2.5-flash (input) | $0.30 / 1M |
| Google · gemini-2.5-flash (output) | $2.50 / 1M |
Discount mechanics
| OpenAI prompt cache (cache-read) | ~50% off input price |
| Anthropic prompt cache (cache-read, 5-min) | ~90% off input price |
| Anthropic extended cache (cache-read, 1-hr) | ~90% off input price · 25% premium on writes |
| OpenAI Batch API | 50% off list (24-hr SLA) |
| Anthropic Batch API | 50% off list (24-hr SLA) |
Why this list is short
We track ~120 model SKUs across providers internally for pricing math. We're not publishing the full table because it goes stale within a quarter and nobody benefits from a wrong number on a search-engine result. If you need a specific model's current price for a procurement document, email us — we'll send our internal sheet.
Methodology
- Prices are pulled directly from each provider's public price page on the "last updated" date.
- We don't include prices behind enterprise sales paywalls.
- Bedrock/Vertex/Azure pricing for OpenAI/Anthropic/Google models matches first-party where the providers publish parity; cross-region surcharges aren't included.