DeepSeek API Pricing & Cost Calculator (2026)

Current DeepSeek API pricing per million tokens for V4 Pro and V4 Flash, with a live calculator, cache-pricing maths, a self-hosting comparison and worked examples.

Updated 2026-06-17 · prices auto-refreshed

DeepSeek has become the reference point for "capable but cheap": its V4 models deliver strong reasoning and coding at a fraction of frontier-API prices, and because the weights are openly available you can also self-host them. That combination of rock-bottom managed pricing and the option of full data residency puts DeepSeek on the shortlist for cost-conscious and privacy-first teams. This page covers the lineup, the aggressive cache pricing, the managed-vs-self-hosted decision and a worked example. The pricing table on this page comes from a live feed; the cost calculator shows what your volume would cost.

The DeepSeek lineup

DeepSeek bills per million input and output tokens. The current V4 generation comes in two tiers (the pricing table also lists V3-series and R1 models):

V4 Flash — the cheapest, fast tier for everyday work.
V4 Pro — stronger reasoning for harder tasks, still far below frontier-API prices.

Both are priced so low that for many workloads response length, not the per-token rate, drives the bill: on both V4 tiers an output token costs twice as much as an input token.

Cache pricing: close to free input

DeepSeek's standout commercial feature is cache pricing. A cache hit on repeated input is billed at an extremely low rate — often two orders of magnitude below the normal input price. For workloads with a large stable prefix (a fixed system prompt, a long reference document), the effective input cost can approach zero on repeated calls. Combined with already-low base prices, this makes DeepSeek one of the cheapest ways to run prompt-heavy, repetitive workloads.

As always, output is the pricier half, so the usual discipline applies: cap response length and reserve the Pro tier for tasks that need it.
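To make the cache effect concrete, here is a minimal sketch of the blended input price as a function of how much of your input is served from cache. The function name and the 2% cache-hit ratio are assumptions for illustration (the FAQ on this page quotes "around 1–2%"); check DeepSeek's current cache-hit price before relying on the result.

```python
def effective_input_price(base_price: float, cached_fraction: float,
                          cache_hit_ratio: float = 0.02) -> float:
    """Blended input price per 1M tokens.

    base_price:      normal (cache-miss) input price per 1M tokens
    cached_fraction: share of input tokens served from cache, 0.0 to 1.0
    cache_hit_ratio: cache-hit price as a fraction of base_price (ASSUMED 2%)
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    hit_price = base_price * cache_hit_ratio
    return cached_fraction * hit_price + (1.0 - cached_fraction) * base_price


# V4 Flash input price from the pricing table: $0.09 per 1M tokens.
for fraction in (0.0, 0.5, 0.9):
    price = effective_input_price(0.09, fraction)
    print(f"{fraction:.0%} of input cached: ${price:.5f} per 1M input tokens")
```

The blended rate falls linearly with the cached share, so the saving depends on how much of each request is a stable prefix, not just on the cache-hit price itself.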

A worked example

A RAG assistant sends a 6,000-token retrieved-context block plus a 2,000-token fixed system prompt and a 300-token question, returning 700 output tokens, 100,000 times a month:

input  = 8,300 tokens   (2,000 cacheable + 6,300 dynamic)
output = 700 tokens

Even at full price this is inexpensive on DeepSeek; with cache pricing applied to the 2,000-token system prompt, that portion becomes negligible. Put your own numbers in the calculator and compare the monthly total against a frontier provider — the gap is frequently 10–30×.
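The arithmetic for this example can be sketched as follows. It uses the V4 Flash rates from the pricing table ($0.09 input, $0.18 output per 1M tokens) and makes two assumptions that are not stated above: a cache-hit price of 2% of the normal input rate, and a cache hit on the system prompt for every request.

```python
REQUESTS_PER_MONTH = 100_000
CACHEABLE_TOKENS = 2_000   # fixed system prompt
DYNAMIC_TOKENS = 6_300     # retrieved context (6,000) + question (300)
OUTPUT_TOKENS = 700

INPUT_PRICE = 0.09         # $ per 1M tokens, V4 Flash (pricing table)
OUTPUT_PRICE = 0.18        # $ per 1M tokens, V4 Flash (pricing table)
CACHE_HIT_RATIO = 0.02     # ASSUMPTION: cache hit billed at 2% of input price


def monthly_cost(use_cache: bool) -> float:
    """Monthly cost in dollars; assumes every request hits the cache if enabled."""
    prefix_rate = INPUT_PRICE * CACHE_HIT_RATIO if use_cache else INPUT_PRICE
    per_request = (
        CACHEABLE_TOKENS * prefix_rate
        + DYNAMIC_TOKENS * INPUT_PRICE
        + OUTPUT_TOKENS * OUTPUT_PRICE
    ) / 1_000_000
    return per_request * REQUESTS_PER_MONTH


print(f"full price:   ${monthly_cost(use_cache=False):.2f} per month")
print(f"with caching: ${monthly_cost(use_cache=True):.2f} per month")
```

Under these assumptions the month costs about $87.30 at full price and about $69.66 with the system prompt cached. The uncached 6,300 dynamic tokens remain the largest line item, which is why trimming retrieved context still matters at volume.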

Managed API or self-hosted?

DeepSeek sits at an interesting crossroads for privacy-first teams. You can use the managed API at rock-bottom prices, or run the open weights yourself for complete data residency. The decision is the classic one:

Managed API — near-zero marginal cost, no operational burden, scales instantly. But data leaves your infrastructure.
Self-hosted — data never leaves, predictable fixed monthly cost regardless of volume. But you own the GPUs, scaling, batching and incident response, and it only wins economically at high, steady utilisation.

Our self-hosted inference review and self-hosted cost breakdown walk through the maths: estimate your monthly token volume, price the managed option here, then compare against fixed GPU + power + a realistic share of ops time — and redo it at half your expected utilisation to test how fragile the case is. At low or bursty volume, DeepSeek's managed API is usually the cheaper and far simpler answer; at sustained high volume or under strict data-residency rules, self-hosting pulls ahead.
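The comparison described above, including the half-utilisation stress test, can be sketched in a few lines. The token volumes and the GPU, power and ops figures are hypothetical placeholders, not measured costs; the API rates are the V4 Pro prices from the pricing table. The sketch also assumes the self-hosted hardware can actually serve the stated volume, which you would need to verify separately.

```python
PRICE_IN = 0.435    # $ per 1M input tokens, V4 Pro (pricing table)
PRICE_OUT = 0.87    # $ per 1M output tokens, V4 Pro (pricing table)

# HYPOTHETICAL fixed monthly self-hosting costs, for illustration only.
GPU_COST = 6_000.0
POWER_COST = 800.0
OPS_COST = 3_000.0  # a realistic share of engineering time


def managed_cost(input_mtok: float, output_mtok: float) -> float:
    """Monthly API cost; volumes are in millions of tokens."""
    return input_mtok * PRICE_IN + output_mtok * PRICE_OUT


def self_hosted_cost() -> float:
    """Fixed monthly cost, independent of volume."""
    return GPU_COST + POWER_COST + OPS_COST


def compare(input_mtok: float, output_mtok: float, utilisation: float = 1.0):
    """Return (managed, self_hosted, cheaper option) at a share of expected volume."""
    managed = managed_cost(input_mtok * utilisation, output_mtok * utilisation)
    hosted = self_hosted_cost()
    winner = "self-hosted" if hosted < managed else "managed API"
    return managed, hosted, winner


# HYPOTHETICAL expected volume: 20,000M input and 5,000M output tokens a month.
for utilisation in (1.0, 0.5):
    managed, hosted, winner = compare(20_000, 5_000, utilisation)
    print(f"utilisation {utilisation:.0%}: managed ${managed:,.0f}, "
          f"self-hosted ${hosted:,.0f}, cheaper: {winner}")
```

With these placeholder numbers self-hosting wins at the expected volume but loses at half of it, which is the kind of fragility the half-utilisation check is meant to expose.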

How to cut your DeepSeek bill

Lean on cache pricing — keep a stable, cacheable prefix and the input half nearly disappears.
Cap output — at these prices, output length is often the dominant cost.
Use Flash by default, Pro only where reasoning quality matters.
Right-size context — DeepSeek is cheap, but trimming dynamic context still helps at very high volume.
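As a rough illustration of the "Flash by default" lever, the sketch below estimates the average cost per request when only a share of traffic is routed to Pro. The rates come from the pricing table; the request shape reuses the worked example (8,300 input, 700 output tokens), and the `pro_share` values are hypothetical.

```python
# $ per 1M tokens (input, output), from the pricing table.
PRICES = {
    "flash": (0.09, 0.18),
    "pro": (0.435, 0.87),
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one uncached request on the given tier."""
    price_in, price_out = PRICES[tier]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def blended_cost(pro_share: float, input_tokens: int, output_tokens: int) -> float:
    """Average cost per request when pro_share of requests go to Pro."""
    if not 0.0 <= pro_share <= 1.0:
        raise ValueError("pro_share must be between 0 and 1")
    pro = request_cost("pro", input_tokens, output_tokens)
    flash = request_cost("flash", input_tokens, output_tokens)
    return pro_share * pro + (1.0 - pro_share) * flash


# HYPOTHETICAL routing mixes for the worked example's request shape.
for share in (0.0, 0.1, 1.0):
    cost = blended_cost(share, input_tokens=8_300, output_tokens=700)
    print(f"{share:.0%} Pro: ${cost:.6f} per request")
```

Because both Pro rates are a little under five times the Flash rates, the share of traffic sent to Pro moves the average quickly, so it is worth measuring how many requests really need it.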

When to choose DeepSeek

Pick it when you want frontier-adjacent quality at the lowest possible per-token price, when your prompts have a large cacheable prefix, or when you want the option to self-host the same model later for data residency. The quality gap to the big frontier labs has narrowed sharply while the price gap remains large — so test it on your real tasks against OpenAI and Anthropic and let the results decide.

Frequently asked questions

Why is DeepSeek so much cheaper than OpenAI or Anthropic? A combination of efficient model architecture, open weights and aggressive pricing strategy. For suitable tasks the quality is competitive; for the very hardest reasoning the frontier labs may still edge ahead.

How low is the cache-hit price? Very — often around 1–2% of the normal input rate. The bigger and more stable your prefix, the more you save.

Can I run DeepSeek myself? Yes, the weights are open. Whether that's cheaper than the API depends entirely on your utilisation — see the self-hosted cost breakdown linked above.

Is the managed API private enough? It excludes your data from training under its terms, but the data still transits a third party. For "data never leaves our infrastructure", self-host the open weights.

Compare DeepSeek across the full field in the LLM API cost calculator.

Prices are auto-refreshed from a live source and dated. Confirm current pricing on DeepSeek's page before committing.

DeepSeek models & current pricing

Model	Weights	Context (tokens)	Input ($ / 1M tokens)	Output ($ / 1M tokens)	Sample cost / request
DeepSeek V4 Flash	open-weight	1M	$0.0900	$0.1800	$0.000540
DeepSeek V3.2	open-weight	131K	$0.2288	$0.3432	$0.001258
DeepSeek V3.2 Exp	open-weight	164K	$0.2700	$0.4100	$0.001490
DeepSeek V3 0324	open-weight	164K	$0.2000	$0.7700	$0.001570
DeepSeek V3	open-weight	131K	$0.2002	$0.8001	$0.001601
DeepSeek V3.1	open-weight	164K	$0.2100	$0.7900	$0.001630
DeepSeek V3.1 Terminus	open-weight	164K	$0.2700	$0.9500	$0.002030
DeepSeek V4 Pro	open-weight	1M	$0.4350	$0.8700	$0.002610
R1 Distill Llama 70B	open-weight	128K	$0.8000	$0.8000	$0.004000
R1 0528	open-weight	164K	$0.5000	$2.15	$0.004150
R1	open-weight	164K	$0.7000	$2.50	$0.005300
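The page does not say what request the "Sample cost / request" column assumes. The listed figures are consistent with a request of 4,000 input tokens and 1,000 output tokens; that split is inferred from the numbers, not documented, so treat it as an assumption. A short sketch that recomputes the column for a few rows:

```python
# ASSUMPTION: request shape inferred from the table, not stated on the page.
SAMPLE_INPUT_TOKENS = 4_000
SAMPLE_OUTPUT_TOKENS = 1_000

# (model, input $/1M, output $/1M, listed sample cost) copied from the table.
ROWS = [
    ("DeepSeek V4 Flash", 0.0900, 0.1800, 0.000540),
    ("DeepSeek V3.2", 0.2288, 0.3432, 0.001258),
    ("DeepSeek V4 Pro", 0.4350, 0.8700, 0.002610),
    ("R1", 0.7000, 2.50, 0.005300),
]


def sample_cost(price_in: float, price_out: float) -> float:
    """Cost in dollars of one sample request at the given per-1M-token rates."""
    return (
        SAMPLE_INPUT_TOKENS * price_in + SAMPLE_OUTPUT_TOKENS * price_out
    ) / 1_000_000


for model, price_in, price_out, listed in ROWS:
    computed = sample_cost(price_in, price_out)
    print(f"{model}: computed ${computed:.6f}, listed ${listed:.6f}")
```

Swapping in your own token counts gives a like-for-like per-request figure across models.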
