Advanced cost modeling with multi-model comparison, intelligent routing, and break-even analysis. Save up to 95% on inference costs.
Built from real production data across multiple deployments. Prices updated weekly. Includes hidden costs like rate limit retries and timeout handling.
Model | Provider | Input | Output | Cached Input | Context |
---|---|---|---|---|---|
GPT-4 Turbo | OpenAI | $10.00 | $30.00 | $5.00 | 128K |
GPT-4o | OpenAI | $5.00 | $15.00 | $2.50 | 128K |
GPT-3.5 Turbo | OpenAI | $0.50 | $1.50 | $0.25 | 16K |
Claude 3 Opus | Anthropic | $15.00 | $75.00 | $7.50 | 200K |
Claude 3.5 Sonnet | Anthropic | $3.00 | $15.00 | $1.50 | 200K |
Gemini Pro | $0.50 | $1.50 | $0.25 | 32K | |
Mixtral 8x7B | Mistral | $0.70 | $0.70 | $0.35 | 32K |
Llama 3 70B | Self-hosted | $0.20* | $0.20* | $0.10* | 8K |