DeepSeek API Pricing & Cost Calculator (2026)

Current DeepSeek API pricing per million tokens for V4 Pro and V4 Flash, with a live calculator, cache-pricing maths, a self-hosting comparison and worked examples.

DeepSeek has become the reference point for "capable but cheap": its V4 models deliver strong reasoning and coding at a fraction of frontier-API prices, and because the weights are openly available you can also self-host them. That combination of very low managed pricing and the option of full data residency makes DeepSeek attractive to cost-conscious and privacy-first teams. This page covers the lineup, the aggressive cache pricing, the managed-vs-self-hosted decision and a worked example. The pricing table further down this page comes from a live feed; the cost calculator shows what your volume would cost.

The DeepSeek lineup

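DeepSeek bills per million input and output tokens across two tiers: V4 Flash, the cheaper default, and V4 Pro, the higher tier for tasks where reasoning quality matters.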

Both are priced so low that for many workloads the dominant cost becomes output length rather than the per-token rate itself.

Cache pricing: close to free input

DeepSeek's standout commercial feature is cache pricing. A cache hit on repeated input is billed at a small fraction of the normal input price, often around 1–2%, which is close to two orders of magnitude lower. For workloads with a large stable prefix (a fixed system prompt, a long reference document), the effective cost of that prefix approaches zero on repeated calls. Combined with already-low base prices, this makes DeepSeek one of the cheapest ways to run prompt-heavy, repetitive workloads.
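To make the cache discount concrete, here is a minimal Python sketch of the blended input rate. It assumes cache hits are billed at a fixed fraction of the normal input rate (2% here, the upper end of the range quoted on this page) and that every cacheable token really is a hit. `blended_input_rate` and its parameters are illustrative names, not part of any DeepSeek SDK.

```python
def blended_input_rate(base_rate, cached_fraction, cache_hit_ratio=0.02):
    """Effective $ per 1M input tokens when part of the prompt is served from cache.

    base_rate        -- normal input price per 1M tokens
    cached_fraction  -- share of input tokens that hit the cache (0.0 to 1.0)
    cache_hit_ratio  -- ASSUMPTION: cache-hit price as a fraction of base_rate
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_part = cached_fraction * base_rate * cache_hit_ratio
    uncached_part = (1.0 - cached_fraction) * base_rate
    return cached_part + uncached_part


# V4 Flash input list price from this page's table: $0.09 per 1M tokens.
for fraction in (0.0, 0.5, 0.9):
    rate = blended_input_rate(0.09, fraction)
    print(f"{fraction:.0%} cached -> ${rate:.4f} per 1M input tokens")
```

The saving scales with the cached share of the prompt, so a prompt that is mostly dynamic context benefits far less than one dominated by a fixed prefix.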

As always, output is the pricier half, so the usual discipline applies: cap response length and reserve the Pro tier for tasks that need it.

A worked example

A RAG assistant sends a 6,000-token retrieved-context block plus a 2,000-token fixed system prompt and a 300-token question, returning 700 output tokens, 100,000 times a month:

input  = 8,300 tokens   (2,000 cacheable + 6,300 dynamic)
output = 700 tokens

Even at full price this is inexpensive on DeepSeek; with cache pricing applied to the 2,000-token system prompt, that portion becomes negligible. Put your own numbers in the calculator and compare the monthly total against a frontier provider — the gap is frequently 10–30×.
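The arithmetic behind this example can be sketched in a few lines of Python. The token counts come from the example above and the list prices from this page's table; the 2% cache-hit ratio is an assumption (the upper end of the range quoted in the FAQ), and the sketch treats every request as a cache hit, which slightly flatters the cached figure.

```python
REQUESTS_PER_MONTH = 100_000
CACHEABLE_INPUT = 2_000   # fixed system prompt
DYNAMIC_INPUT = 6_300     # 6,000 retrieved context + 300 question
OUTPUT_TOKENS = 700
CACHE_HIT_RATIO = 0.02    # ASSUMPTION: cache hits cost 2% of the input rate

# List prices per 1M tokens (input, output), copied from this page's table.
PRICES = {
    "DeepSeek V4 Flash": (0.09, 0.18),
    "DeepSeek V4 Pro": (0.435, 0.87),
}


def monthly_cost(input_rate, output_rate, use_cache):
    """Monthly $ cost for the RAG workload, with or without cache pricing."""
    cached_rate = input_rate * CACHE_HIT_RATIO if use_cache else input_rate
    per_request = (
        CACHEABLE_INPUT * cached_rate
        + DYNAMIC_INPUT * input_rate
        + OUTPUT_TOKENS * output_rate
    ) / 1_000_000
    return per_request * REQUESTS_PER_MONTH


for model, (input_rate, output_rate) in PRICES.items():
    full = monthly_cost(input_rate, output_rate, use_cache=False)
    cached = monthly_cost(input_rate, output_rate, use_cache=True)
    print(f"{model}: ${full:.2f} full price, ${cached:.2f} with cached system prompt")
```

Under these assumptions V4 Flash comes to roughly $87 a month at full price and roughly $70 with the system prompt cached; V4 Pro comes to roughly $422 and $337. In this particular workload the 6,300 dynamic input tokens are the largest line item, so trimming retrieved context saves more than the cache discount does.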

Managed API or self-hosted?

DeepSeek sits at an interesting crossroads for privacy-first teams. You can use the managed API at rock-bottom prices, or run the open weights yourself for complete data residency. The decision is the classic one: pay per token for convenience, or pay a fixed infrastructure cost for control.

Our self-hosted inference review and self-hosted cost breakdown walk through the maths: estimate your monthly token volume, price the managed option here, then compare against fixed GPU + power + a realistic share of ops time — and redo it at half your expected utilisation to test how fragile the case is. At low or bursty volume, DeepSeek's managed API is usually the cheaper and far simpler answer; at sustained high volume or under strict data-residency rules, self-hosting pulls ahead.
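The comparison described above can be sketched as a small Python script. Every self-hosting figure and the token volume below are hypothetical placeholders chosen only to show the mechanics; the per-token rates are the V4 Pro list prices from this page's table. The function names are illustrative.

```python
def managed_monthly_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Managed-API cost in $ for a month's token volume (rates are per 1M tokens)."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def self_hosted_monthly_cost(gpu, power, ops_share):
    """Fixed monthly $ cost of running the open weights yourself."""
    return gpu + power + ops_share


def compare(input_tokens, output_tokens, input_rate, output_rate, fixed_cost):
    """Return (managed, self_hosted, cheaper_option) for one utilisation level."""
    managed = managed_monthly_cost(input_tokens, output_tokens, input_rate, output_rate)
    cheaper = "managed" if managed <= fixed_cost else "self-hosted"
    return managed, fixed_cost, cheaper


# HYPOTHETICAL inputs: replace every number with your own estimates.
fixed = self_hosted_monthly_cost(gpu=6_000.0, power=500.0, ops_share=2_500.0)
expected_in, expected_out = 20_000_000_000, 3_000_000_000  # tokens per month

for label, scale in (("expected", 1.0), ("half utilisation", 0.5)):
    managed, hosted, cheaper = compare(
        int(expected_in * scale), int(expected_out * scale),
        input_rate=0.435, output_rate=0.87,  # V4 Pro list prices from this page
        fixed_cost=fixed,
    )
    print(f"{label}: managed ${managed:,.0f} vs self-hosted ${hosted:,.0f} -> {cheaper}")
```

With these placeholder numbers self-hosting wins at the expected volume and loses at half of it, because the managed bill shrinks with usage while the fixed cost does not. That flip is the fragility the half-utilisation check is meant to expose.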

How to cut your DeepSeek bill

  1. Lean on cache pricing: keep a stable, cacheable prefix so that portion of the input is billed at the cache-hit rate.
  2. Cap output — at these prices, output length is often the dominant cost.
  3. Use Flash by default, Pro only where reasoning quality matters.
  4. Right-size context — DeepSeek is cheap, but trimming dynamic context still helps at very high volume.

When to choose DeepSeek

Pick it when you want frontier-adjacent quality at the lowest possible per-token price, when your prompts have a large cacheable prefix, or when you want the option to self-host the same model later for data residency. The quality gap to the big frontier labs has narrowed sharply while the price gap remains large — so test it on your real tasks against OpenAI and Anthropic and let the results decide.

Frequently asked questions

Why is DeepSeek so much cheaper than OpenAI or Anthropic?

A combination of an efficient model architecture, open weights and an aggressive pricing strategy. For suitable tasks the quality is competitive; for the very hardest reasoning the frontier labs may still edge ahead.

How low is the cache-hit price?

Very low: often around 1–2% of the normal input rate. The bigger and more stable your prefix, the more you save.

Can I run DeepSeek myself?

Yes, the weights are open. Whether that's cheaper than the API depends entirely on your utilisation; see the self-hosted cost breakdown linked above.

Is the managed API private enough?

It excludes your data from training under its terms, but the data still transits a third party. If the requirement is "data never leaves our infrastructure", self-host the open weights.

Compare DeepSeek across the full field in the LLM API cost calculator.

Prices are auto-refreshed from a live source and dated. Confirm current pricing on DeepSeek's page before committing.

DeepSeek models & current pricing

API price = list price per 1M tokens (input → output). Sample cost = one request at 4,000 input + 1,000 output tokens. Use the full calculator for your own volume.

Model                   Weights      Context  Input / 1M  Output / 1M  Sample cost / request
DeepSeek V4 Flash       open-weight  1M       $0.0900     $0.1800      $0.000540
DeepSeek V3.2           open-weight  131K     $0.2288     $0.3432      $0.001258
DeepSeek V3.2 Exp       open-weight  164K     $0.2700     $0.4100      $0.001490
DeepSeek V3 0324        open-weight  164K     $0.2000     $0.7700      $0.001570
DeepSeek V3             open-weight  131K     $0.2002     $0.8001      $0.001601
DeepSeek V3.1           open-weight  164K     $0.2100     $0.7900      $0.001630
DeepSeek V3.1 Terminus  open-weight  164K     $0.2700     $0.9500      $0.002030
DeepSeek V4 Pro         open-weight  1M       $0.4350     $0.8700      $0.002610
R1 Distill Llama 70B    open-weight  128K     $0.8000     $0.8000      $0.004000
R1 0528                 open-weight  164K     $0.5000     $2.15        $0.004150
R1                      open-weight  164K     $0.7000     $2.50        $0.005300

Estimates only; confirm current pricing on the provider's page. Prices auto-refresh every 12h.
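The sample-cost column can be reproduced with one line of arithmetic. The sketch below recomputes it for three rows from the table above, using the stated 4,000 input + 1,000 output tokens per request; `sample_cost` is an illustrative helper name.

```python
def sample_cost(input_rate, output_rate, input_tokens=4_000, output_tokens=1_000):
    """Cost in $ of one request, given list prices per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Three rows copied from the table above: (input $/1M, output $/1M).
rows = {
    "DeepSeek V4 Flash": (0.09, 0.18),
    "DeepSeek V4 Pro": (0.435, 0.87),
    "R1": (0.70, 2.50),
}

for model, (input_rate, output_rate) in rows.items():
    print(f"{model}: ${sample_cost(input_rate, output_rate):.6f} per request")
```

Passing your own `input_tokens` and `output_tokens` gives a per-request figure for a different request shape at the same list prices.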
