
Google Gemini API Pricing & Cost Calculator (2026)

Current Google Gemini API pricing per million tokens — Pro, Flash and Flash-Lite — with a live calculator, context-tier pricing explained, worked examples and cost tips.

Updated 2026-06-17 · prices auto-refreshed


Google's Gemini API is often the value leader, especially for high-volume and long-context workloads, thanks to very competitive Flash and Flash-Lite tiers and context windows of up to about 1M tokens. It also has a pricing quirk most providers lack, context-tiered pricing, that is worth understanding before you commit. This page explains the lineup, the tier mechanics, a worked example and how to keep costs down. The pricing table on this page comes from a live feed; use the cost calculator to estimate your own spend.

The Gemini lineup

Gemini is sold in a clear ladder, billed per million input and output tokens:

Pro — frontier reasoning, the largest practical context, and the only tier with context-tiered pricing.
Flash — the balanced, fast default: strong quality at a fraction of Pro's price.
Flash-Lite — the cheapest tier, excellent for classification, extraction and summarisation at scale.

The enormous context windows (up to ~1M tokens) make Gemini attractive whenever you genuinely need to put a lot of material in front of the model in one shot — long documents, large codebases, big transcripts.

Context-tiered pricing (the quirk to watch)

On the Pro tier, the per-token price increases once a request crosses a context threshold (around 200k tokens): both input and output cost more above the line. Our dataset records these tiers; the headline figure shown in the table is the standard (≤200k) rate. The practical implication: a workflow that occasionally sends very long contexts can cost noticeably more than the headline suggests. If you routinely operate above the threshold, budget for the higher tier — and consider whether you can stay under it by chunking or summarising.

Flash and Flash-Lite generally use flat pricing, which is part of what makes them so predictable for high-volume work.
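The tier mechanics can be sketched in a few lines of Python. This is a minimal illustration, not Google's billing logic: the standard rates are the Gemini 2.5 Pro row from the table on this page, the above-threshold rates are placeholders chosen for the example, and the sketch assumes the whole request is billed at the higher rate once the prompt exceeds the threshold. The names `TieredRate` and `request_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredRate:
    """USD per 1M tokens at or below, and above, a context threshold."""
    threshold_tokens: int
    input_low: float
    output_low: float
    input_high: float
    output_high: float


def request_cost(rate: TieredRate, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD.

    Assumption: the whole request moves to the higher rate once the
    prompt is longer than the threshold.
    """
    over = input_tokens > rate.threshold_tokens
    in_price = rate.input_high if over else rate.input_low
    out_price = rate.output_high if over else rate.output_low
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Standard rates: Gemini 2.5 Pro row in the table on this page.
# Above-threshold rates: PLACEHOLDERS for illustration, not quoted prices.
PRO = TieredRate(200_000, 1.25, 10.00, 2.50, 15.00)

short = request_cost(PRO, 150_000, 1_000)        # stays under the threshold
long = request_cost(PRO, 250_000, 1_000)         # crosses the threshold
chunked = 2 * request_cost(PRO, 125_000, 1_000)  # same material in two requests

print(f"short: ${short:.4f}  long: ${long:.4f}  chunked: ${chunked:.4f}")
```

Under these placeholder rates, splitting the 250k-token prompt into two requests costs less than sending it whole, even though the output is paid for twice. Whether chunking preserves answer quality is a separate question the arithmetic cannot settle.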

Context caching

Gemini bills cached context at a fraction of the input rate, with a separate small storage component for very large cached blocks held over time. For repeated long prompts — the same big document queried many times — this is a meaningful saving on top of already-low Flash pricing.
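A rough way to reason about the saving is to price the repeated context with and without caching. The sketch below is an illustration under stated assumptions: the cached-rate fraction and the hourly storage price are placeholders, not quoted Gemini prices, and the function names are hypothetical. The input price used is the Gemini 2.5 Flash figure from the table on this page.

```python
def uncached_cost(context_tokens: int, calls: int, input_price: float) -> float:
    """Cost of sending the same context on every call, in USD."""
    return context_tokens / 1_000_000 * input_price * calls


def cached_cost(
    context_tokens: int,
    calls: int,
    input_price: float,         # USD per 1M input tokens
    cached_fraction: float,     # cached rate as a share of input_price (assumed)
    storage_price_hour: float,  # USD per 1M cached tokens per hour (assumed)
    hours_cached: float,
) -> float:
    """Cost of the repeated context alone when it is cached after the first call."""
    millions = context_tokens / 1_000_000
    first_call = millions * input_price
    later_calls = millions * input_price * cached_fraction * (calls - 1)
    storage = millions * storage_price_hour * hours_cached
    return first_call + later_calls + storage


# 100k-token context queried 1,000 times within 24 hours.
# cached_fraction=0.25 and storage_price_hour=1.00 are PLACEHOLDERS.
plain = uncached_cost(100_000, 1_000, 0.30)
cached = cached_cost(100_000, 1_000, 0.30, 0.25, 1.00, 24)

print(f"uncached: ${plain:.2f}  cached: ${cached:.2f}")
```

The storage term is why caching pays off only when the context is reused often enough: a large block held for a long time with few queries can cost more than resending it.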

A worked example

A document-Q&A feature feeds a 50,000-token document plus a 200-token question, and returns a 600-token answer, 30,000 times a month, on Flash:

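input tokens  = 50,200 per request (50,000 document + 200 question)
output tokens = 600 per request
monthly cost  = ((50,200/1e6 × input price) + (600/1e6 × output price)) × 30,000

Here input price and output price are the Flash rates in USD per 1M tokens.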

Because the document is identical across many questions, context caching bills it at a fraction after the first call — turning a large input line item into a small one. Run it in the calculator with caching on to see the effect. Note how heavily the input dominates here: with 50k input vs 600 output, this is a workload where Gemini's cheap input tiers and caching shine.
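To put numbers on the formula, the snippet below plugs in the Gemini 2.5 Flash rates from the table on this page ($0.30 input, $2.50 output per 1M tokens). Which Flash model the example intends is an assumption, and the figure is the uncached baseline, before any caching discount.

```python
# Rates in USD per 1M tokens; Gemini 2.5 Flash row from the table on this page
# (assumed to be the "Flash" tier the example refers to).
INPUT_PRICE = 0.30
OUTPUT_PRICE = 2.50

input_tokens = 50_000 + 200   # document + question
output_tokens = 600
requests_per_month = 30_000

input_cost = input_tokens / 1e6 * INPUT_PRICE * requests_per_month
output_cost = output_tokens / 1e6 * OUTPUT_PRICE * requests_per_month
monthly = input_cost + output_cost
input_share = input_cost / monthly

print(f"monthly: ${monthly:,.2f}  (input share {input_share:.0%})")
```

At these rates the uncached bill comes to about $496.80 a month, with roughly 91% of it from input tokens, which is why caching the document has such a large effect on this workload.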

How to cut your Gemini bill

Default to Flash-Lite, escalate to Flash, and only reach for Pro when quality genuinely demands it.
Stay under the context tier threshold where you can — chunk or summarise long inputs to avoid the higher Pro rate.
Use context caching for stable, repeated material like fixed documents or knowledge bases.
Cap output length — as everywhere, output is the pricier half.
Send bulk jobs through the batch tier for the asynchronous discount.
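The first tip is easy to quantify. The sketch below compares one workload across three tiers, using rates copied from the table on this page. The workload itself (2,000 input tokens, 300 output tokens, one million requests a month) is a made-up example, and `monthly_cost` is a hypothetical helper.

```python
# (input, output) in USD per 1M tokens, from the table on this page.
RATES = {
    "Gemini 2.5 Flash Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int, requests: int) -> float:
    """Monthly spend in USD for a fixed-size request repeated `requests` times."""
    in_price, out_price = RATES[model]
    per_request = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_request * requests


# Hypothetical workload: 2,000 tokens in, 300 out, 1M requests per month.
costs = {model: monthly_cost(model, 2_000, 300, 1_000_000) for model in RATES}

for model, usd in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<24} ${usd:>10,.2f}")
```

For this workload the gap between Flash Lite and Pro is more than an order of magnitude, so routing only the hard requests to the higher tiers tends to matter more than any other single tip in the list.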

Gemini vs the alternatives

For high-volume, cost-sensitive tasks — classification, extraction, summarisation, simple chat — Flash-Lite is among the cheapest capable options anywhere, rivalled mainly by DeepSeek and open-weight models. At the top end, Pro competes with OpenAI's flagship and Anthropic's Opus and offers the largest practical context, though the context-tier pricing means you should model long-context costs carefully rather than trusting the headline rate.

Frequently asked questions

Why is my Gemini Pro bill higher than the headline price? Almost certainly context-tiered pricing: requests above ~200k tokens are billed at a higher per-token rate. Check whether your prompts cross that threshold.

Is Flash-Lite good enough for production? For classification, extraction, routing and summarisation, usually yes — and at a fraction of frontier prices. Reserve Flash/Pro for tasks where quality clearly improves the outcome.

Does Gemini support prompt/context caching? Yes. Stable repeated context is billed at a reduced rate, with a small storage fee for large cached blocks — worthwhile when the same material is queried many times.

Which is cheaper, Gemini or OpenAI? It depends on the task and model tier. For high-volume simple work Gemini Flash-Lite is typically cheaper; at the frontier they're closer. Compare your exact scenario in the calculator.

Compare Gemini against OpenAI, Anthropic and the open-weight field in the LLM API cost calculator.

Prices are auto-refreshed from a live source and dated. Confirm current pricing on Google's page before committing.

Google models & current pricing

Model	Tag	Context	API price / 1M tokens (input → output)	Sample cost / request
Gemma 3 4B	fast	131K	$0.0500 → $0.1000	$0.000300
Gemma 3 12B	fast	131K	$0.0500 → $0.1500	$0.000350
Gemma 3n 4B	fast	33K	$0.0600 → $0.1200	$0.000360
Gemma 3 27B	fast	131K	$0.0800 → $0.1600	$0.000480
Gemma 4 26B A4B	fast	262K	$0.0600 → $0.3300	$0.000570
Gemini 2.5 Flash Lite Preview 09-2025	fast	1M	$0.1000 → $0.4000	$0.000800
Gemini 2.5 Flash Lite	fast	1M	$0.1000 → $0.4000	$0.000800
Gemma 4 31B	fast	262K	$0.1200 → $0.3500	$0.000830
Gemini 3.1 Flash Lite	fast	1M	$0.2500 → $1.50	$0.002500
Gemini 3.1 Flash Lite Preview	fast	1M	$0.2500 → $1.50	$0.002500
Gemma 2 27B	fast	8K	$0.6500 → $0.6500	$0.003250
Nano Banana (Gemini 2.5 Flash Image)	fast	33K	$0.3000 → $2.50	$0.003700
Gemini 2.5 Flash	fast	1M	$0.3000 → $2.50	$0.003700
Nano Banana 2 (Gemini 3.1 Flash Image Preview)	fast	131K	$0.5000 → $3.00	$0.005000
Gemini 3 Flash Preview	fast	1M	$0.5000 → $3.00	$0.005000
Gemini 2.5 Pro	balanced	1M	$1.25 → $10.00	$0.0150
Gemini 2.5 Pro Preview 06-05	balanced	1M	$1.25 → $10.00	$0.0150
Gemini 2.5 Pro Preview 05-06	balanced	1M	$1.25 → $10.00	$0.0150
Gemini 3.5 Flash	balanced	1M	$1.50 → $9.00	$0.0150
Gemini 3.1 Pro Preview Custom Tools	balanced	1M	$2.00 → $12.00	$0.0200
Gemini 3.1 Pro Preview	balanced	1M	$2.00 → $12.00	$0.0200
Nano Banana Pro (Gemini 3 Pro Image Preview)	balanced	66K	$2.00 → $12.00	$0.0200
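The page does not say what request the sample cost column assumes. The figures are consistent with a request of 4,000 input tokens and 1,000 output tokens; that size is inferred from the listed numbers, not stated by the source. The short check below reproduces a few rows under that assumption.

```python
# Inferred, not stated on the page: the sample request size.
SAMPLE_INPUT_TOKENS = 4_000
SAMPLE_OUTPUT_TOKENS = 1_000


def sample_cost(in_price: float, out_price: float) -> float:
    """Per-request cost in USD for the inferred sample request."""
    return (SAMPLE_INPUT_TOKENS * in_price + SAMPLE_OUTPUT_TOKENS * out_price) / 1_000_000


# (model, input price, output price, listed sample cost), from the table.
ROWS = [
    ("Gemma 3 4B", 0.05, 0.10, 0.000300),
    ("Gemma 4 26B A4B", 0.06, 0.33, 0.000570),
    ("Gemini 2.5 Flash", 0.30, 2.50, 0.003700),
    ("Gemini 2.5 Pro", 1.25, 10.00, 0.0150),
    ("Gemini 3.1 Pro Preview", 2.00, 12.00, 0.0200),
]

matches = [abs(sample_cost(i, o) - listed) < 1e-9 for _, i, o, listed in ROWS]
print(all(matches))
```

If your own requests have a different input-to-output ratio, the ranking by sample cost can change, since output is priced several times higher than input on most rows.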
