Comparison · 9 min · 2 citations
DeepSeek vs Gemini API Pricing 2026: Per-Token Cost Compared
DeepSeek vs Gemini API pricing 2026: DeepSeek V4-flash is $0.14/$0.28 per million tokens, Gemini 3.5 Flash is $1.50/$9.00. The low-cost frontier vs Google.
DeepSeek is far cheaper at the workhorse tier. DeepSeek V4-flash is $0.14 input (cache miss) / $0.28 output per million tokens; the nearest Gemini tier, Gemini 3.5 Flash, is $1.50 / $9.00[1][2]. On output, DeepSeek V4-flash is roughly 32x cheaper. Google's cheapest, Gemini 2.5 Flash-Lite at $0.10 / $0.40, is the closest competitor.
DeepSeek wins on raw text cost; Gemini buys data residency, documented long-context pricing tiers, multimodal input, and Google Cloud integration. One caveat: the DeepSeek docs flag a 75% promo on the V4-pro model ending 2026-05-31, after which V4-pro moves to one quarter of the original rate. Verify the V4-pro number after that date.
DeepSeek and Gemini sit at opposite ends of the cost-versus-platform trade. DeepSeek V4-flash prices text at $0.14 input / $0.28 output per million tokens, undercutting Gemini 3.5 Flash ($1.50 / $9.00) by an order of magnitude on output. This article lays out the verified rates, flags DeepSeek's expiring promo and Gemini's context-tier pricing, and runs both through the margin calculator so the cost gap shows up as gross margin on a real SaaS.
1. The per-token rate table
Standard rates per million tokens, verified against both official pricing pages as of May 25, 2026.
| Tier | DeepSeek model | Input / Output | Gemini model | Input / Output |
|---|---|---|---|---|
| Workhorse | DeepSeek V4-flash[1] | $0.14 / $0.28 | Gemini 3.5 Flash[2] | $1.50 / $9.00 |
| Cheapest tier | DeepSeek V4-flash[1] | $0.14 / $0.28 | Gemini 2.5 Flash-Lite[2] | $0.10 / $0.40 |
| Cache hit (input) | V4-flash cache hit[1] | $0.0028 | Gemini 3.5 Flash cached[2] | $0.15 |
| Reasoning / pro | DeepSeek V4-pro (promo)[1] | $0.435 / $0.87 | Gemini 3.1 Pro (≤200k)[2] | $2.00 / $12.00 |
| Pro, long context | Not publicly documented (as of May 2026) | — | Gemini 3.1 Pro (>200k)[2] | $4.00 / $18.00 |
The headline gap is on output. DeepSeek V4-flash output at $0.28 against Gemini 3.5 Flash output at $9.00 is a 32x difference. Even against Google's cheapest tier (Gemini 2.5 Flash-Lite at $0.40 output), DeepSeek V4-flash is still below on output and roughly comparable on input. The cache-hit rate on DeepSeek V4-flash ($0.0028 input) is extreme: a fully cached prompt costs almost nothing.
2. DeepSeek's cache and promo structure
Two DeepSeek details change the real cost. First, the cache-hit input rate of $0.0028 per million tokens is 50x cheaper than the cache-miss rate of $0.14[1]. For workloads that resend a large stable context, the effective input cost collapses toward the cache-hit rate. Second, the DeepSeek docs note a 75% promotional discount on the V4-pro model ending 2026-05-31 15:59 UTC, with V4-pro pricing then adjusting to one quarter of the original rate[1].
The implication: the V4-pro rates quoted here ($0.435 input / $0.87 output) are promotional and time-bounded. The V4-flash rates ($0.14 / $0.28) are the standard published numbers. Because the post-promotion V4-pro rate is a stated future adjustment rather than a currently-billed price, verify it directly after May 31 before building cost projections on the reasoning tier. The deepseek-chat and deepseek-reasoner names are documented as deprecated aliases folded into V4-flash and V4-pro.
3. Gemini's tier ladder and context tiers
Gemini publishes a deeper tier ladder with context-dependent pricing[2]. Gemini 3.1 Pro charges $2.00 input / $12.00 output at or below 200k-token prompts, and $4.00 / $18.00 above 200k. Gemini 2.5 Pro has the same split shape ($1.25 / $10.00 at or below 200k, $2.50 / $15.00 above). The cheap tiers (2.5 Flash-Lite at $0.10 / $0.40, 3.1 Flash-Lite at $0.25 / $1.50) are where Gemini comes closest to DeepSeek on cost.
What the Gemini premium buys over DeepSeek: documented long-context pricing tiers rather than a flat rate, multimodal input priced per modality (text, image, video, audio), and Google Cloud data-residency and compliance options. For a text-only RAG product the DeepSeek cost advantage is decisive. For a multimodal product, or one with EU data-residency requirements, Gemini's platform breadth can justify the higher per-token rate.
4. Margin impact on a $19 SaaS
The margin calculator prices a $19/month SaaS where each user triggers 40 API calls a day at 800 input and 600 output tokens, first on DeepSeek V4-flash ($0.14 / $0.28), then on Gemini 3.5 Flash ($1.50 / $9.00). The engine renders gross margin from the verified rates:
Show the recompute-verified inputs and outputs
| subscription_price | 19 |
|---|---|
| avg_api_calls_per_day | 40 |
| avg_input_tokens | 800 |
| avg_output_tokens | 600 |
| input_cost_per_million | 0.14 |
| output_cost_per_million | 0.28 |
| hosting_cost_per_user | 0.5 |
| other_per_user_costs | 0.25 |
| api cost per user | 0.34 |
|---|---|
| total cost per user | 1.09 |
| gross margin per user | 17.91 |
| gross margin percent | 94.3 |
| api share percent | 31.2 |
| dominant cost driver | Hosting |
| scale tiers › row 1 › users | 100 |
| scale tiers › row 1 › total revenue | 1900 |
| scale tiers › row 1 › total cost | 109 |
| scale tiers › row 1 › total profit | 1791 |
| scale tiers › row 1 › margin percent | 94.3 |
| scale tiers › row 2 › users | 1000 |
| scale tiers › row 2 › total revenue | 19000 |
| scale tiers › row 2 › total cost | 1090 |
| scale tiers › row 2 › total profit | 17910 |
| scale tiers › row 2 › margin percent | 94.3 |
| scale tiers › row 3 › users | 10000 |
| scale tiers › row 3 › total revenue | 190000 |
| scale tiers › row 3 › total cost | 10900 |
| scale tiers › row 3 › total profit | 179100 |
| scale tiers › row 3 › margin percent | 94.3 |
| insight | 94.3% gross margin is healthy. Hosting is your largest cost at $0.5/user/month. At 10K users you keep $179100/month after per-user costs. |
Computed live at build time.
| subscription_price | 19 |
|---|---|
| avg_api_calls_per_day | 40 |
| avg_input_tokens | 800 |
| avg_output_tokens | 600 |
| input_cost_per_million | 1.5 |
| output_cost_per_million | 9 |
| hosting_cost_per_user | 0.5 |
| other_per_user_costs | 0.25 |
| api cost per user | 7.92 |
|---|---|
| total cost per user | 8.67 |
| gross margin per user | 10.33 |
| gross margin percent | 54.4 |
| api share percent | 91.3 |
| dominant cost driver | AI API |
| scale tiers › row 1 › users | 100 |
| scale tiers › row 1 › total revenue | 1900 |
| scale tiers › row 1 › total cost | 867 |
| scale tiers › row 1 › total profit | 1033 |
| scale tiers › row 1 › margin percent | 54.4 |
| scale tiers › row 2 › users | 1000 |
| scale tiers › row 2 › total revenue | 19000 |
| scale tiers › row 2 › total cost | 8670 |
| scale tiers › row 2 › total profit | 10330 |
| scale tiers › row 2 › margin percent | 54.4 |
| scale tiers › row 3 › users | 10000 |
| scale tiers › row 3 › total revenue | 190000 |
| scale tiers › row 3 › total cost | 86700 |
| scale tiers › row 3 › total profit | 103300 |
| scale tiers › row 3 › margin percent | 54.4 |
| insight | AI API is 91.3% of your per-user cost ($7.92/mo). At 10K users, that is $79200/month on API alone. Caching responses or using a tiered model approach could significantly improve margins. |
Computed live at build time.
The gap between the two engine runs is large: the DeepSeek run holds a high gross margin because per-user API cost is a few cents, while the Gemini 3.5 Flash run gives up a substantial share of the $19 price to API cost at this call volume. At 40 calls a day per user with 600 output tokens each, the output rate difference ($0.28 vs $9.00) dominates everything. For a high-call-volume product on a low subscription price, DeepSeek is the difference between a healthy margin and a thin one.
The honest caveat: the calculator prices cost, not capability. If Gemini 3.5 Flash produces materially better output on your task, the higher cost may pay for itself in retention. But on raw unit economics at this scale, the verified rates make DeepSeek the cost frontier. To compare against Western flagships, see Anthropic vs OpenAI pricing and the full cheapest LLM API ranking.
5. Decision guidance
- Text-only, high call volume, cost-sensitive: DeepSeek V4-flash. The output rate of $0.28 against Gemini 3.5 Flash's $9.00 decides it.
- Multimodal (image, video, audio): Gemini. DeepSeek's published text rates do not cover those modalities the way Gemini's per-modality pricing does.
- EU data residency or Google Cloud integration: Gemini, where data-residency options are documented.
- Reasoning workloads on a budget: price DeepSeek V4-pro, but re-verify after the 2026-05-31 promo end before committing.
Re-verify both pricing pages before committing, and specifically re-check the DeepSeek V4-pro rate after May 31, 2026. For the full provider field including open-weight hosts, see the cheapest LLM API ranking.
All per-token figures verified against official pricing pages as of 2026-05-25.
Frequently asked questions
Is DeepSeek or Gemini cheaper in 2026?
DeepSeek is dramatically cheaper at the workhorse tier. DeepSeek V4-flash is $0.14 per million input tokens (cache miss) and $0.28 output, verified on the DeepSeek pricing docs as of May 2026. The nearest Gemini tier, Gemini 3.5 Flash, is $1.50 input and $9.00 output. On output specifically, DeepSeek V4-flash is roughly 32 times cheaper than Gemini 3.5 Flash. Google's cheapest tier, Gemini 2.5 Flash-Lite at $0.10 / $0.40, is closer but still above DeepSeek on output.
Does DeepSeek have a discount that is ending?
Yes. The DeepSeek pricing docs note a 75 percent promotional discount on the V4-pro model ending 2026-05-31 15:59 UTC, after which V4-pro pricing is set to one quarter of the original rate. The V4-flash rates of $0.14 input / $0.28 output are the standard published rates. Verify the V4-pro rate directly after May 31 because the post-promotion number is a stated future adjustment.
Why would I pay more for Gemini over DeepSeek?
Gemini buys Google's data-residency options, native long-context tiers up to and beyond 200k tokens with documented per-tier pricing, multimodal input across text, image, video, and audio, and integration with Google Cloud. DeepSeek wins on raw per-token cost for text. The choice is cost-frontier versus platform-and-multimodal breadth, not a pure price decision.
References
Sources
Primary sources only. No vendor-marketing blogs or aggregated secondary claims.
- 1 DeepSeek — API pricing (V4-flash, V4-pro cache-hit/miss and output rates, promo end date) — accessed 2026-05-25
- 2 Google — Gemini API pricing (3.5 Flash, 3.1 Pro, 3.1 Flash-Lite, 2.5 series per-MTok rates) — accessed 2026-05-25
Tools referenced in this article