Instructions

How to use the AI API cost calculator

Compare 63 models across 11 providers in four steps. Set one workload, read the monthly estimate, rank every model, and tune cache-aware token math. Pricing reference updated Jun 2026.

Quick start (under 2 minutes)

  1. Set your workload

  2. Review the live estimate

  3. Compare all models

  4. Fine-tune token mix (optional)

Workload presets

Presets fill messages, tokens, and token-mix defaults. Click one on the calculator, then adjust if your traffic differs.

  • Support bot

    High volume, short replies

    50,000 messages / mo · 800 tokens avg per message

    Customer support, FAQ bots, ticket triage

  • Code assistant

    Medium volume, long context

    5,000 messages / mo · 4,000 tokens avg per message

    IDE copilots, PR review, refactors

  • RAG / search

    Retrieval-heavy prompts

    10,000 messages / mo · 2,500 tokens avg per message

    Document Q&A, knowledge bases, search-augmented apps

  • Agent / tools

    Multi-step, mixed I/O

    2,000 messages / mo · 8,000 tokens avg per message

    Tool-calling agents, workflows, multi-turn reasoning
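To make the preset numbers concrete, here is a minimal sketch of the monthly token volume each preset implies. The message counts and averages are the ones listed above; the names `PRESETS` and `monthly_tokens` are illustrative and not part of the calculator.

```python
# Preset values as listed above: (messages per month, average tokens per message).
PRESETS = {
    "Support bot": (50_000, 800),
    "Code assistant": (5_000, 4_000),
    "RAG / search": (10_000, 2_500),
    "Agent / tools": (2_000, 8_000),
}


def monthly_tokens(messages_per_month: int, tokens_per_message: int) -> int:
    """Total tokens per month, counting prompt and completion together."""
    return messages_per_month * tokens_per_message


for name, (messages, tokens) in PRESETS.items():
    print(f"{name}: {monthly_tokens(messages, tokens):,} tokens / mo")
```

Note that the highest-volume preset (Support bot) is also the largest in total tokens, while the Agent / tools preset has the largest messages but the smallest monthly total.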

01. Set your workload

Start with a preset that matches your app, or enter your own message volume and average tokens per message.

  • Messages per month = API calls or chat turns (one user request + model reply).
  • Tokens per message = prompt + completion size. Use your analytics average, or start with a preset.
  • Presets pre-fill input/output and cache ratios for common product shapes.

02. Review the live estimate

The calculator shows estimated monthly spend for your selected model and where it ranks against every other priced model.

  • Monthly cost updates instantly when you change workload inputs.
  • Token breakdown splits input, cached input, and output spend.
  • Rank shows the selected model's position among all priced models; #1 is the cheapest for the same workload.
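The breakdown above can be approximated by hand. The sketch below assumes prices are quoted per 1M tokens and that cached input is billed at its own rate; the `Price` class, the `monthly_cost` function, and every price in the example are hypothetical, and the calculator's actual formula may differ.

```python
from dataclasses import dataclass


@dataclass
class Price:
    # Hypothetical USD prices per 1M tokens, not taken from any provider's price list.
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def monthly_cost(messages, tokens_per_message, input_share, cached_share, price):
    """Split monthly tokens into uncached input, cached input, and output, then price each."""
    total_tokens = messages * tokens_per_message
    input_tokens = total_tokens * input_share      # prompt/context tokens
    cached_tokens = input_tokens * cached_share    # prompt tokens served from cache
    uncached_tokens = input_tokens - cached_tokens
    output_tokens = total_tokens - input_tokens    # model reply tokens
    breakdown = {
        "input": uncached_tokens / 1e6 * price.input_per_m,
        "cached_input": cached_tokens / 1e6 * price.cached_input_per_m,
        "output": output_tokens / 1e6 * price.output_per_m,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Support bot preset (50,000 messages, 800 tokens each) with assumed ratios and prices.
example = monthly_cost(50_000, 800, input_share=0.75, cached_share=0.5,
                       price=Price(input_per_m=2.0, cached_input_per_m=0.5, output_per_m=8.0))
print(example)
```

With these assumed numbers the 40M monthly tokens split into 15M uncached input, 15M cached input, and 10M output, for a total of $117.50.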

03. Compare all models

Use the ranking table and top picks to find the lowest-cost model that still fits your quality bar.

  • Filter by provider (OpenAI, Anthropic, Google, Mistral, etc.).
  • Search by model name. Click any row to inspect it in the estimate panel.
  • Compare headline list prices fairly: every provider is priced against the same workload baseline.
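Ranking and filtering can be sketched as a sort over per-model monthly costs. The model names, providers, and costs below are placeholders, and the choice to keep each model's overall rank after a provider filter is an assumption rather than documented calculator behavior.

```python
def rank_models(costs, provider=None):
    """Rank all models by monthly cost (rank 1 = cheapest), then optionally filter by provider.

    `costs` maps model name -> (provider, monthly cost in USD).
    Ranks are assigned before filtering, so a filtered row keeps its overall position.
    """
    ordered = sorted(costs.items(), key=lambda item: item[1][1])
    ranked = [(rank, model, prov, cost)
              for rank, (model, (prov, cost)) in enumerate(ordered, start=1)]
    if provider is None:
        return ranked
    return [row for row in ranked if row[2] == provider]


# Placeholder data: none of these names or figures come from the calculator.
costs = {
    "model-a": ("ProviderX", 117.50),
    "model-b": ("ProviderY", 42.00),
    "model-c": ("ProviderX", 310.25),
}

for rank, model, prov, cost in rank_models(costs):
    print(f"#{rank} {model} ({prov}): ${cost:,.2f} / mo")
```

Because every cost is computed from the same workload, the ordering reflects price differences only; quality still has to be judged separately.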

04. Fine-tune token mix (optional)

Open Advanced token mix when RAG, agents, or long-context apps need a more accurate split.

  • Input vs output ratio: the share of tokens sent as prompt/context versus the share returned as the model's reply.
  • Cached input ratio: the share of prompt tokens served from the provider's cache, billed at a lower rate where the provider offers one.
  • Per-message token summary updates live so you can sanity-check the math.
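To sanity-check the per-message summary by hand, the two ratios can be applied in sequence. This is a minimal sketch: `per_message_split` and the example ratios are assumptions for illustration, not the calculator's defaults.

```python
def per_message_split(tokens_per_message, input_ratio, cached_input_ratio):
    """Split one message's tokens into uncached input, cached input, and output."""
    input_tokens = round(tokens_per_message * input_ratio)
    cached = round(input_tokens * cached_input_ratio)
    return {
        "input_uncached": input_tokens - cached,
        "input_cached": cached,
        "output": tokens_per_message - input_tokens,
    }


# RAG / search preset (2,500 tokens per message) with assumed ratios:
# 80% of tokens are input, and 25% of that input is served from cache.
split = per_message_split(2_500, input_ratio=0.8, cached_input_ratio=0.25)
print(split)
```

The three parts always add back up to the tokens-per-message figure, which is a quick check that the ratios were entered as intended.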

Embed on your site

Add the calculator to docs, internal tools, or landing pages with the embed widget. It loads the same live pricing engine as the main site.
