LLM API Cost Calculator

Enter your usage profile and instantly compare what every major LLM provider would charge — per request and per month. The default scenario below is real, server-rendered output; change any field to recalculate live.

Estimated cost — per month

Shown in USD. Pricing last verified 2026-06-19.

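API price = the provider's list price per 1M tokens (input → output). Your cost = the estimate for the default usage profile (4,000 input + 1,000 output tokens per request). The monthly column equals the per-request cost × 1,000, which corresponds to 1,000 requests per month.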

Showing 50 of 60 models.

Sorted by cost per request, cheapest first.

Provider | Model | Tags | Context | Input / 1M | Output / 1M | Your cost / request | × cheapest | Your cost / month
--- | --- | --- | --- | --- | --- | --- | --- | ---
OpenAI | gpt-oss-20b | fast, cheapest, batch | 131K | $0.0290 | $0.1400 | $0.000256 | baseline | $0.2560
OpenAI | gpt-oss-120b | fast, batch | 131K | $0.0390 | $0.1800 | $0.000336 | ×1.3 | $0.3360
OpenAI | gpt-oss-safeguard-20b | fast, cache, batch | 131K | $0.0750 | $0.3000 | $0.000600 | ×2.3 | $0.6000
OpenAI | GPT-5 Nano | fast, cache, batch | 400K | $0.0500 | $0.4000 | $0.000600 | ×2.3 | $0.6000
OpenAI | GPT-4.1 Nano | fast, cache, batch | 1M | $0.1000 | $0.4000 | $0.000800 | ×3.1 | $0.8000
OpenAI | GPT-4o-mini Search Preview | fast, batch | 128K | $0.1500 | $0.6000 | $0.001200 | ×4.7 | $1.20
OpenAI | GPT-4o-mini (2024-07-18) | fast, cache, batch | 128K | $0.1500 | $0.6000 | $0.001200 | ×4.7 | $1.20
OpenAI | GPT-4o-mini | fast, cache, batch | 128K | $0.1500 | $0.6000 | $0.001200 | ×4.7 | $1.20
OpenAI | GPT-5.4 Nano | fast, cache, batch | 400K | $0.2000 | $1.25 | $0.002050 | ×8.0 | $2.05
OpenAI | GPT-5.1-Codex-Mini | fast, cache, batch | 400K | $0.2500 | $2.00 | $0.003000 | ×11.7 | $3.00
OpenAI | GPT-5 Mini | fast, cache, batch | 400K | $0.2500 | $2.00 | $0.003000 | ×11.7 | $3.00
OpenAI | GPT-4.1 Mini | fast, cache, batch | 1M | $0.4000 | $1.60 | $0.003200 | ×12.5 | $3.20
OpenAI | GPT-3.5 Turbo | fast, batch | 16K | $0.5000 | $1.50 | $0.003500 | ×13.7 | $3.50
OpenAI | GPT Audio Mini | fast, batch | 128K | $0.6000 | $2.40 | $0.004800 | ×18.7 | $4.80
OpenAI | GPT-3.5 Turbo (older v0613) | fast, batch | 4K | $1.00 | $2.00 | $0.006000 | ×23.4 | $6.00
OpenAI | GPT-5.4 Mini | fast, cache, batch | 400K | $0.7500 | $4.50 | $0.007500 | ×29.3 | $7.50
OpenAI | GPT-3.5 Turbo Instruct | fast, batch | 4K | $1.50 | $2.00 | $0.008000 | ×31.2 | $8.00
OpenAI | o4 Mini High | fast, cache, batch | 200K | $1.10 | $4.40 | $0.008800 | ×34.4 | $8.80
OpenAI | o4 Mini | fast, cache, batch | 200K | $1.10 | $4.40 | $0.008800 | ×34.4 | $8.80
OpenAI | o3 Mini High | fast, cache, batch | 200K | $1.10 | $4.40 | $0.008800 | ×34.4 | $8.80
OpenAI | o3 Mini | fast, cache, batch | 200K | $1.10 | $4.40 | $0.008800 | ×34.4 | $8.80
OpenAI | GPT-5 Image Mini | fast, cache, batch | 400K | $2.50 | $2.00 | $0.0120 | ×46.9 | $12.00
OpenAI | GPT-5.1-Codex-Max | balanced, cache, batch | 400K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-5.1 | balanced, cache, batch | 400K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-5.1 Chat | balanced, cache, batch | 128K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-5.1-Codex | balanced, cache, batch | 400K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-5 Codex | balanced, cache, batch | 400K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-5 Chat | balanced, cache, batch | 128K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-5 | balanced, cache, batch | 400K | $1.25 | $10.00 | $0.0150 | ×58.6 | $15.00
OpenAI | GPT-3.5 Turbo 16k | fast, batch | 16K | $3.00 | $4.00 | $0.0160 | ×62.5 | $16.00
OpenAI | o4 Mini Deep Research | balanced, cache, batch | 200K | $2.00 | $8.00 | $0.0160 | ×62.5 | $16.00
OpenAI | o3 | balanced, cache, batch | 200K | $2.00 | $8.00 | $0.0160 | ×62.5 | $16.00
OpenAI | GPT-4.1 | balanced, cache, batch | 1M | $2.00 | $8.00 | $0.0160 | ×62.5 | $16.00
OpenAI | GPT Audio | balanced, batch | 128K | $2.50 | $10.00 | $0.0200 | ×78.1 | $20.00
OpenAI | GPT-4o Search Preview | balanced, batch | 128K | $2.50 | $10.00 | $0.0200 | ×78.1 | $20.00
OpenAI | GPT-4o (2024-11-20) | balanced, cache, batch | 128K | $2.50 | $10.00 | $0.0200 | ×78.1 | $20.00
OpenAI | GPT-4o (2024-08-06) | balanced, cache, batch | 128K | $2.50 | $10.00 | $0.0200 | ×78.1 | $20.00
OpenAI | GPT-4o | balanced, batch | 128K | $2.50 | $10.00 | $0.0200 | ×78.1 | $20.00
OpenAI | GPT-5.3 Chat | balanced, cache, batch | 128K | $1.75 | $14.00 | $0.0210 | ×82.0 | $21.00
OpenAI | GPT-5.3-Codex | balanced, cache, batch | 400K | $1.75 | $14.00 | $0.0210 | ×82.0 | $21.00
OpenAI | GPT-5.2-Codex | balanced, cache, batch | 400K | $1.75 | $14.00 | $0.0210 | ×82.0 | $21.00
OpenAI | GPT-5.2 Chat | balanced, cache, batch | 128K | $1.75 | $14.00 | $0.0210 | ×82.0 | $21.00
OpenAI | GPT-5.2 | balanced, cache, batch | 400K | $1.75 | $14.00 | $0.0210 | ×82.0 | $21.00
OpenAI | GPT-5.4 | frontier, cache, batch | 1.1M | $2.50 | $15.00 | $0.0250 | ×97.7 | $25.00
OpenAI | GPT-4o (2024-05-13) | frontier, batch | 128K | $5.00 | $15.00 | $0.0350 | ×137 | $35.00
OpenAI | GPT-5.4 Image 2 | frontier, cache, batch | 272K | $8.00 | $15.00 | $0.0470 | ×184 | $47.00
OpenAI | GPT-5 Image | balanced, cache, batch | 400K | $10.00 | $10.00 | $0.0500 | ×195 | $50.00
OpenAI | GPT Chat Latest | frontier, cache, batch | 400K | $5.00 | $30.00 | $0.0500 | ×195 | $50.00
OpenAI | GPT-5.5 | frontier, cache, batch | 1.1M | $5.00 | $30.00 | $0.0500 | ×195 | $50.00
OpenAI | GPT-4 Turbo | frontier, batch | 128K | $10.00 | $30.00 | $0.0700 | ×273 | $70.00

Estimates only. Actual bills depend on exact token counts, tier pricing and provider changes. Always confirm on the provider's pricing page.
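The cost and multiplier figures in the table can be reproduced from the list prices alone. The sketch below does this for four rows copied from the table; the names `PRICES`, `request_cost` and `rank` are illustrative, and the 1,000 requests per month is an assumption inferred from the monthly column being exactly 1,000 times the per-request cost.

```python
# (model, input $/1M tokens, output $/1M tokens), copied from four table rows
PRICES = [
    ("GPT-5", 1.25, 10.00),
    ("gpt-oss-20b", 0.029, 0.14),
    ("GPT-4o-mini", 0.15, 0.60),
    ("GPT-4 Turbo", 10.00, 30.00),
]

INPUT_TOKENS, OUTPUT_TOKENS = 4_000, 1_000  # default usage profile
REQUESTS_PER_MONTH = 1_000  # assumption inferred from the monthly column


def request_cost(in_price, out_price):
    """USD cost of one request at the default usage profile."""
    return (INPUT_TOKENS * in_price + OUTPUT_TOKENS * out_price) / 1_000_000


def rank(prices):
    """Sort models by cost per request and attach the multiple of the cheapest."""
    rows = sorted(
        ((name, request_cost(i, o)) for name, i, o in prices),
        key=lambda row: row[1],
    )
    cheapest = rows[0][1]
    return [(name, cost, cost / cheapest) for name, cost in rows]


for name, cost, multiple in rank(PRICES):
    monthly = cost * REQUESTS_PER_MONTH
    print(f"{name:<12} ${cost:.6f}  x{multiple:.1f}  ${monthly:.2f}/month")
```

Because the multiple is relative to the cheapest model in whatever list is ranked, filtering the list changes every multiplier.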

How LLM API pricing works

Every major LLM provider bills by the token, a chunk of text roughly ¾ of a word in English. You pay separately for input tokens (everything you send: system prompt, retrieved context and the user message) and output tokens (what the model writes back). Output is usually priced several times higher than input (4× or 8× for most models in the table above), which is why concise responses save real money at scale.
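For a quick budget estimate, the ¾-of-a-word rule of thumb can be turned into a rough token count. This is only a sketch: `estimate_tokens` is a hypothetical helper, it counts whitespace-separated words, and real tokenizers can differ noticeably, especially for code or non-English text.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the text; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (rounded up)."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


prompt = "Summarize the following support ticket in two sentences."
print(estimate_tokens(prompt))  # 8 words -> 11 tokens (estimate)
```

For billing-grade numbers, use the token counts the provider reports for each request rather than an estimate like this.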

The formula

cost_per_request = (input_tokens  / 1,000,000) × input_price_per_M
                 + (output_tokens / 1,000,000) × output_price_per_M
cost_per_period  = cost_per_request × requests_in_period
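As a minimal sketch, the formula translates directly into two small functions. The function and parameter names mirror the formula; the worked example uses the gpt-oss-20b row from the table ($0.029 in, $0.14 out) and assumes 1,000 requests per month.

```python
def cost_per_request(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m):
    """USD cost of one request, given list prices per 1M tokens."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)


def cost_per_period(per_request, requests_in_period):
    """USD cost for a billing period at a fixed per-request cost."""
    return per_request * requests_in_period


# gpt-oss-20b with the default profile: 4,000 input + 1,000 output tokens
per_request = cost_per_request(4_000, 1_000, 0.029, 0.14)
per_month = cost_per_period(per_request, 1_000)  # assumed 1,000 requests/month

print(f"${per_request:.6f} per request")  # $0.000256 per request
print(f"${per_month:.4f} per month")      # $0.2560 per month
```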

What moves the number

Frequently asked questions

How are LLM API costs calculated?

Providers bill per million tokens, separately for input (your prompt) and output (the model's response). Cost per request = (input tokens / 1,000,000 × input price) + (output tokens / 1,000,000 × output price). Multiply by your request volume for the period.

What is the difference between input and output tokens?

Input tokens are everything you send to the model — system prompt, context and user message. Output tokens are what the model generates. Output is usually priced several times higher than input, so response length matters a lot for cost.
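To see why response length matters, it helps to split a request's cost into its input and output parts. The sketch below uses the GPT-5 row from the table ($1.25 in, $10.00 out) with the default profile; `cost_breakdown` is an illustrative helper, not part of the calculator.

```python
def cost_breakdown(input_tokens, output_tokens,
                   input_price_per_m, output_price_per_m):
    """Split one request's USD cost into input and output components."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    total = input_cost + output_cost
    return {
        "input": input_cost,
        "output": output_cost,
        "total": total,
        # Fraction of the bill caused by output tokens (0 if nothing is billed)
        "output_share": output_cost / total if total else 0.0,
    }


full = cost_breakdown(4_000, 1_000, 1.25, 10.00)
short = cost_breakdown(4_000, 500, 1.25, 10.00)  # same prompt, half the output

print(f"output share of cost: {full['output_share']:.0%}")
print(f"saving from halving output: {1 - short['total'] / full['total']:.0%}")
```

In this example output makes up 20% of the tokens but about two thirds of the cost, so halving the response length cuts the bill by roughly a third.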

Does prompt caching reduce cost?

Yes, where supported. Repeated prompt prefixes (e.g. a long system prompt) can be billed at a steep discount. The calculator lets you set what share of input tokens is cached.
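A cached share can be folded into the cost formula by pricing the cached part of the input at a discount. The sketch below is an assumption about how such an estimate might work, not the calculator's actual logic: `cached_share` and `cached_discount` are hypothetical parameters, and the 50% discount is a placeholder, so check the provider's pricing page for the real cached-input rate.

```python
def cost_with_cache(input_tokens, output_tokens,
                    input_price_per_m, output_price_per_m,
                    cached_share=0.0, cached_discount=0.5):
    """USD cost of one request when part of the input is served from cache.

    cached_share    fraction of input tokens billed at the cached rate (0..1)
    cached_discount fraction taken off the input price for cached tokens
                    (placeholder value; actual discounts vary by provider)
    """
    if not 0.0 <= cached_share <= 1.0:
        raise ValueError("cached_share must be between 0 and 1")
    if not 0.0 <= cached_discount <= 1.0:
        raise ValueError("cached_discount must be between 0 and 1")

    cached_tokens = input_tokens * cached_share
    fresh_tokens = input_tokens - cached_tokens
    cached_price = input_price_per_m * (1 - cached_discount)

    return (fresh_tokens * input_price_per_m
            + cached_tokens * cached_price
            + output_tokens * output_price_per_m) / 1_000_000


# GPT-4o-mini prices from the table, default profile, 75% of input cached
uncached = cost_with_cache(4_000, 1_000, 0.15, 0.60)
cached = cost_with_cache(4_000, 1_000, 0.15, 0.60, cached_share=0.75)
print(f"${uncached:.6f} -> ${cached:.6f}")
```

Caching only lowers the input side of the bill, so its effect shrinks as output tokens make up more of the cost.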

Why do prices vary so much between providers?

Model size, hardware efficiency, context window, and business strategy all play a part. Frontier reasoning models cost the most; fast/cheap and open-weight self-hosted options can be orders of magnitude cheaper for suitable tasks.