API Pricing

Google Gemini API Pricing & Cost Calculator (2026)

Current Google Gemini API pricing per million tokens — Pro, Flash and Flash-Lite — with a live calculator, context-tier pricing explained, worked examples and cost tips.

Google's Gemini API is often the value leader, especially for high-volume and long-context workloads, thanks to very competitive Flash and Flash-Lite tiers and enormous context windows. It also has a pricing quirk that most providers lack, context-tiered pricing, which is worth understanding before you commit. This page explains the lineup, the tier mechanics, a worked example and how to keep costs down. The pricing table on this page comes from a live feed; use the cost calculator to see your real spend.

The Gemini lineup

Gemini is sold in a clear ladder, with Flash-Lite at the bottom, then Flash, then Pro at the top, and is billed per million input and output tokens.

The enormous context windows (up to ~1M tokens) make Gemini attractive whenever you genuinely need to put a lot of material in front of the model in one shot — long documents, large codebases, big transcripts.

Context-tiered pricing (the quirk to watch)

On the Pro tier, the per-token price increases once a request crosses a context threshold (around 200k tokens): both input and output cost more above the line. Our dataset records these tiers; the headline figure shown in the table is the standard (≤200k) rate. The practical implication: a workflow that occasionally sends very long contexts can cost noticeably more than the headline suggests. If you routinely operate above the threshold, budget for the higher tier — and consider whether you can stay under it by chunking or summarising.
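
The tier mechanics can be sketched in a few lines of Python. This is a minimal illustration, not Google's billing logic: the function name and the above-threshold rates are hypothetical placeholders, the 200,000-token threshold is the approximate figure quoted above, and the sketch assumes the threshold is tested against the prompt size, with the whole request billed at the higher rates once it crosses. The standard rates are the Gemini 2.5 Pro figures from the table on this page.

```python
def tiered_request_cost(input_tokens, output_tokens,
                        std_in, std_out, long_in, long_out,
                        threshold=200_000):
    """Cost in USD for one request under two-tier context pricing.

    All prices are USD per 1M tokens. Assumes the whole request is billed
    at the long-context rates once the prompt exceeds `threshold` tokens.
    """
    if input_tokens > threshold:
        in_rate, out_rate = long_in, long_out
    else:
        in_rate, out_rate = std_in, std_out
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Standard rates: Gemini 2.5 Pro from the table ($1.25 in, $10.00 out).
STD_IN, STD_OUT = 1.25, 10.00
# Long-context rates are PLACEHOLDERS for illustration, not real prices.
LONG_IN, LONG_OUT = 2.50, 15.00

short = tiered_request_cost(150_000, 1_000, STD_IN, STD_OUT, LONG_IN, LONG_OUT)
long_ = tiered_request_cost(250_000, 1_000, STD_IN, STD_OUT, LONG_IN, LONG_OUT)
print(f"150k-token prompt: ${short:.4f}")
print(f"250k-token prompt: ${long_:.4f}")
```

With these placeholder rates, the 250k-token request costs more than three times the 150k-token one even though the prompt is well under twice as long, which is why occasional long contexts can distort a budget built on the headline rate.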

Flash and Flash-Lite generally use flat pricing, which is part of what makes them so predictable for high-volume work.

Context caching

Gemini bills cached context at a fraction of the input rate, with a separate small storage component for very large cached blocks held over time. For repeated long prompts — the same big document queried many times — this is a meaningful saving on top of already-low Flash pricing.
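
Whether caching pays off depends on how often the cached block is reused while it is being stored. The sketch below estimates a break-even call rate. It assumes, for illustration only, that storage is billed per million cached tokens per hour and that cached tokens are billed at a fixed fraction of the input price; the function name and both of those rates are hypothetical, and the one-off full-price first call is ignored. The input price is the Gemini 2.5 Flash figure from the table on this page.

```python
def breakeven_calls_per_hour(in_price, cached_fraction, storage_price_per_hour):
    """Calls per hour above which caching beats resending the context.

    in_price: USD per 1M input tokens at the normal rate.
    cached_fraction: cached-token price as a fraction of in_price (0 to 1).
    storage_price_per_hour: USD per 1M cached tokens per hour of storage.
    """
    # Saving per call, per 1M cached tokens, from the reduced cached rate.
    saving_per_call = in_price * (1.0 - cached_fraction)
    if saving_per_call <= 0:
        raise ValueError("caching must be cheaper than the normal input rate")
    # Storage cost per hour divided by the saving per call.
    return storage_price_per_hour / saving_per_call


IN_PRICE = 0.30          # Gemini 2.5 Flash input price from the table
CACHED_FRACTION = 0.25   # hypothetical discount level
STORAGE_PER_HOUR = 1.00  # hypothetical storage rate, USD per 1M tokens per hour

calls = breakeven_calls_per_hour(IN_PRICE, CACHED_FRACTION, STORAGE_PER_HOUR)
print(f"Caching pays off above about {calls:.1f} calls per hour")
```

Under these assumptions the document size cancels out, because both the storage fee and the per-call saving scale with the number of cached tokens; only the reuse rate matters.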

A worked example

A document-Q&A feature feeds a 50,000-token document plus a 200-token question, and returns a 600-token answer, 30,000 times a month, on Flash:

input_tokens  = 50,000 + 200 = 50,200
output_tokens = 600
monthly_cost  = ((50,200/1e6 × input_price) + (600/1e6 × output_price)) × 30,000
(input_price and output_price are the USD rates per 1M tokens)

Because the document is identical across many questions, context caching bills it at a fraction of the input rate after the first call, turning a large input line item into a small one. Run it in the calculator with caching on to see the effect. Note how heavily input dominates here: with roughly 50k input tokens against 600 output tokens per call, this is a workload where Gemini's cheap input tiers and caching shine.
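
As a rough check on the formula above, here is the same calculation in Python. It assumes "Flash" means the Gemini 2.5 Flash row of the table on this page ($0.30 in, $2.50 out); the cached-token discount is a hypothetical fraction, and the storage fee and the full-price first call are left out.

```python
IN_PRICE = 0.30         # USD per 1M input tokens (Gemini 2.5 Flash, from the table)
OUT_PRICE = 2.50        # USD per 1M output tokens (same row)
CACHED_FRACTION = 0.25  # hypothetical: cached tokens billed at 25% of IN_PRICE

DOC_TOKENS = 50_000
QUESTION_TOKENS = 200
OUTPUT_TOKENS = 600
CALLS_PER_MONTH = 30_000


def monthly_cost(cached_fraction=None):
    """Monthly cost in USD.

    If cached_fraction is given, the document tokens are billed at that
    fraction of the input price; the question is always billed in full.
    """
    doc_rate = IN_PRICE if cached_fraction is None else IN_PRICE * cached_fraction
    per_call = (DOC_TOKENS * doc_rate
                + QUESTION_TOKENS * IN_PRICE
                + OUTPUT_TOKENS * OUT_PRICE) / 1_000_000
    return per_call * CALLS_PER_MONTH


print(f"No caching:   ${monthly_cost():,.2f}")
print(f"With caching: ${monthly_cost(CACHED_FRACTION):,.2f}")
```

At these rates the uncached bill is $496.80 a month, of which $450 is the document itself; with the assumed 25% cached rate it drops to $159.30 before storage.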

How to cut your Gemini bill

  1. Default to Flash-Lite, escalate to Flash, and only reach for Pro when quality genuinely demands it.
  2. Stay under the context tier threshold where you can — chunk or summarise long inputs to avoid the higher Pro rate.
  3. Use context caching for stable, repeated material like fixed documents or knowledge bases.
  4. Cap output length: as with most providers, output tokens are priced well above input tokens.
  5. Send bulk jobs through the batch tier for the asynchronous discount.
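
The effect of the first tip can be estimated with a blended-cost sketch. The prices are the Gemini 2.5 Flash Lite, Flash and Pro rows of the table on this page; the dictionary keys are plain labels rather than API model IDs, the traffic split is hypothetical, and the sketch assumes a router sends each request to exactly one tier (no retries on a larger model).

```python
PRICES = {  # USD per 1M tokens as (input, output), from the table on this page
    "flash-lite": (0.10, 0.40),   # Gemini 2.5 Flash Lite
    "flash": (0.30, 2.50),        # Gemini 2.5 Flash
    "pro": (1.25, 10.00),         # Gemini 2.5 Pro
}


def request_cost(tier, input_tokens, output_tokens):
    """Cost in USD of one request on the given tier."""
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_cost(mix, input_tokens, output_tokens):
    """Average cost per request when traffic is split across tiers.

    mix maps tier label -> share of requests; shares must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * request_cost(tier, input_tokens, output_tokens)
               for tier, share in mix.items())


# Hypothetical split: most traffic on Flash-Lite, a little escalated.
mix = {"flash-lite": 0.80, "flash": 0.15, "pro": 0.05}
avg = blended_cost(mix, 4_000, 1_000)
all_pro = request_cost("pro", 4_000, 1_000)
print(f"Blended: ${avg:.6f} per request")
print(f"All Pro: ${all_pro:.6f} per request")
```

The split is the number to measure in your own traffic; the sketch only shows how quickly the average falls when most requests stay on the cheapest tier.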

Gemini vs the alternatives

For high-volume, cost-sensitive tasks — classification, extraction, summarisation, simple chat — Flash-Lite is among the cheapest capable options anywhere, rivalled mainly by DeepSeek and open-weight models. At the top end, Pro competes with OpenAI's flagship and Anthropic's Opus and offers the largest practical context, though the context-tier pricing means you should model long-context costs carefully rather than trusting the headline rate.

Frequently asked questions

Why is my Gemini Pro bill higher than the headline price? Almost certainly context-tiered pricing: requests above ~200k tokens are billed at a higher per-token rate. Check whether your prompts cross that threshold.

Is Flash-Lite good enough for production? For classification, extraction, routing and summarisation, usually yes — and at a fraction of frontier prices. Reserve Flash/Pro for tasks where quality clearly improves the outcome.

Does Gemini support prompt/context caching? Yes. Stable repeated context is billed at a reduced rate, with a small storage fee for large cached blocks — worthwhile when the same material is queried many times.

Which is cheaper, Gemini or OpenAI? It depends on the task and model tier. For high-volume simple work Gemini Flash-Lite is typically cheaper; at the frontier they're closer. Compare your exact scenario in the calculator.

Compare Gemini against OpenAI, Anthropic and the open-weight field in the LLM API cost calculator.

Prices are auto-refreshed from a live source and dated. Confirm current pricing on Google's page before committing.

Google models & current pricing

API price = list price per 1M tokens (input → output). Sample cost = one request at 4,000 input + 1,000 output tokens. Use the full calculator for your own volume.

Model | Tag | Context | Input / 1M tokens | Output / 1M tokens | Sample cost / request
Gemma 3 4B | fast | 131K | $0.0500 | $0.1000 | $0.000300
Gemma 3 12B | fast | 131K | $0.0500 | $0.1500 | $0.000350
Gemma 3n 4B | fast | 33K | $0.0600 | $0.1200 | $0.000360
Gemma 3 27B | fast | 131K | $0.0800 | $0.1600 | $0.000480
Gemma 4 26B A4B | fast | 262K | $0.0600 | $0.3300 | $0.000570
Gemini 2.5 Flash Lite Preview 09-2025 | fast | 1M | $0.1000 | $0.4000 | $0.000800
Gemini 2.5 Flash Lite | fast | 1M | $0.1000 | $0.4000 | $0.000800
Gemma 4 31B | fast | 262K | $0.1200 | $0.3500 | $0.000830
Gemini 3.1 Flash Lite | fast | 1M | $0.2500 | $1.50 | $0.002500
Gemini 3.1 Flash Lite Preview | fast | 1M | $0.2500 | $1.50 | $0.002500
Gemma 2 27B | fast | 8K | $0.6500 | $0.6500 | $0.003250
Nano Banana (Gemini 2.5 Flash Image) | fast | 33K | $0.3000 | $2.50 | $0.003700
Gemini 2.5 Flash | fast | 1M | $0.3000 | $2.50 | $0.003700
Nano Banana 2 (Gemini 3.1 Flash Image Preview) | fast | 131K | $0.5000 | $3.00 | $0.005000
Gemini 3 Flash Preview | fast | 1M | $0.5000 | $3.00 | $0.005000
Gemini 2.5 Pro | balanced | 1M | $1.25 | $10.00 | $0.0150
Gemini 2.5 Pro Preview 06-05 | balanced | 1M | $1.25 | $10.00 | $0.0150
Gemini 2.5 Pro Preview 05-06 | balanced | 1M | $1.25 | $10.00 | $0.0150
Gemini 3.5 Flash | balanced | 1M | $1.50 | $9.00 | $0.0150
Gemini 3.1 Pro Preview Custom Tools | balanced | 1M | $2.00 | $12.00 | $0.0200
Gemini 3.1 Pro Preview | balanced | 1M | $2.00 | $12.00 | $0.0200
Nano Banana Pro (Gemini 3 Pro Image Preview) | balanced | 66K | $2.00 | $12.00 | $0.0200

Estimates only; confirm current pricing on the provider's page. Prices auto-refresh every 12h.
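
The sample-cost column can be reproduced from the two price columns. A minimal sketch, using the 4,000-input, 1,000-output request defined above and a handful of rows copied from the table; the function and variable names are illustrative only.

```python
SAMPLE_INPUT_TOKENS = 4_000
SAMPLE_OUTPUT_TOKENS = 1_000


def sample_cost(in_price, out_price):
    """Cost in USD of the sample request, given USD prices per 1M tokens."""
    return (SAMPLE_INPUT_TOKENS * in_price
            + SAMPLE_OUTPUT_TOKENS * out_price) / 1_000_000


# (model, input price, output price, listed sample cost), from the table.
rows = [
    ("Gemma 3 4B", 0.05, 0.10, 0.000300),
    ("Gemini 2.5 Flash Lite", 0.10, 0.40, 0.000800),
    ("Gemini 2.5 Flash", 0.30, 2.50, 0.003700),
    ("Gemini 2.5 Pro", 1.25, 10.00, 0.0150),
]

for name, in_price, out_price, listed in rows:
    computed = sample_cost(in_price, out_price)
    print(f"{name}: computed ${computed:.6f}, listed ${listed:.6f}")
```

Swapping in your own token counts gives a per-request figure for any row, though it uses only the standard rate and ignores context tiers, caching and batch discounts.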
