Comparison · 9 min · 2 citations

Anthropic vs OpenAI API Pricing 2026: Per-Token Cost Compared

Anthropic vs OpenAI API pricing 2026: Claude Sonnet 4.6 is $3/$15 per million tokens, GPT-5.4 is $2.50/$15. Flagship, mini, and cached rates compared.

By AI Biz Hub · Published May 25, 2026 · Updated June 12, 2026

Education · General business information, not legal, tax, or financial advice. Editorial standards Sponsor disclosure Corrections

TL;DR

At the mid tier, OpenAI's GPT-5.4 ($2.50 input / $15 output per million tokens) is marginally cheaper on input than Claude Sonnet 4.6 ($3 / $15), with identical output^[1]^[2]. At the flagship tier, Claude Opus 4.7 ($5 / $25) undercuts GPT-5.5 ($5 / $30) on output. At the cheap tier, GPT-5.4-mini ($0.75 / $4.50) is below Claude Haiku 4.5 ($1 / $5).

The blended winner depends on your input-to-output ratio: output-heavy workloads favor Claude at the flagship tier, input-heavy workloads favor OpenAI's mid tier. Both cut cached input by roughly 90% and offer a 50% batch discount. The per-token spread at the comparable tier is small enough that capability fit usually decides the choice, not price.

Update June 2026: Anthropic's Opus flagship is now Claude Opus 4.8 (released 2026-05-28), which replaces Opus 4.7 at the same $5 / $25 standard rate and 1M context. Every figure and comparison below still holds.

Anthropic and OpenAI price within cents of each other at matched tiers, so the real lever is your input-to-output ratio and how much you can cache. Claude Sonnet 4.6 runs $3 input / $15 output per million tokens; OpenAI's GPT-5.4 runs $2.50 / $15. This article maps the tiers across both vendors, prices the prompt-caching discount, and runs a SaaS margin scenario through the calculator so the cost difference shows up as gross margin rather than abstract per-token rates.

1. The per-token rate table

Standard (non-batch) rates per million tokens, verified against both official pricing pages as of May 25, 2026.

Tier	Anthropic model	Input / Output	OpenAI model	Input / Output
Flagship	Claude Opus 4.7^[1]	$5 / $25	GPT-5.5^[2]	$5 / $30
Mid / workhorse	Claude Sonnet 4.6^[1]	$3 / $15	GPT-5.4^[2]	$2.50 / $15
Cheap / fast	Claude Haiku 4.5^[1]	$1 / $5	GPT-5.4-mini^[2]	$0.75 / $4.50
Cheapest	Claude Haiku 4.5^[1]	$1 / $5	GPT-5.4-nano^[2]	$0.20 / $1.25
Cached input (read)	Sonnet 4.6 cache hit^[1]	$0.30	GPT-5.4 cached^[2]	$0.25
Batch (50% off)	Sonnet 4.6 batch^[1]	$1.50 / $7.50	GPT-5.4 batch^[2]	$1.25 / $7.50

The pattern: OpenAI is slightly lower on input at the mid and cheap tiers, while Anthropic is lower on output at the flagship tier ($25 vs $30). At the rock-bottom end, OpenAI's GPT-5.4-nano ($0.20 / $1.25) is far below anything Anthropic publishes, so high-volume, low-complexity classification favors OpenAI's nano tier.

2. Matching tiers across vendors

Comparing list prices is only fair if the models are comparable in capability. The rough mapping for cost planning:

Flagship reasoning: Claude Opus (4.8 since 2026-05-28, $5/$25, same rate as the 4.7 it replaced) against GPT-5.5. Both top each vendor's lineup. The Opus 4.7 generation shipped a new tokenizer that the Anthropic docs noted may consume up to 35% more tokens for the same text, which can offset its lower per-token output rate; re-check token counts on 4.8 for your prompts rather than assuming the headline rate transfers one to one.
Production workhorse: Claude Sonnet 4.6 against GPT-5.4. This is the tier most SaaS products run on. The $0.50 input gap favors OpenAI, the output is identical, so an input-heavy retrieval workload tilts to GPT-5.4.
Cheap and fast: Claude Haiku 4.5 against GPT-5.4-mini. Close on output; GPT-5.4-mini is cheaper on input.

The tokenizer note matters for Opus 4.7 specifically. A lower per-token rate that consumes more tokens per request can net out higher than a competitor with a higher rate and a leaner tokenizer. The only honest comparison is to run your real prompts through both and compare total dollars, not per-token rates.

3. Prompt caching changes the effective rate

Both vendors discount cached input by roughly 90%. On Anthropic, a cache read costs 0.1x the base input rate, so Sonnet 4.6 cached input is $0.30 per million tokens instead of $3^[1]. Anthropic charges a cache-write premium (1.25x base for a 5-minute cache, 2x for a 1-hour cache), so caching pays off after the first read on the short cache. OpenAI's cached input is similarly discounted, with GPT-5.4 cached input at $0.25 against $2.50 standard^[2].

For an agent that resends a large system prompt, a long document, or conversation history on every call, the cached portion dominates input volume. If 80% of your input tokens are cacheable, the effective input rate on Sonnet 4.6 drops from $3 toward roughly $0.84 per million (0.8 × $0.30 + 0.2 × $3). Caching is the single biggest lever on input cost at both vendors, and it narrows the already-small per-token gap between them.

4. Margin impact on a $29 SaaS

Abstract per-token rates do not communicate the decision. The margin calculator below prices a $29/month SaaS where each user triggers 20 API calls a day at 500 input and 1,000 output tokens, first on Claude Sonnet 4.6 ($3 / $15), then on GPT-5.4 ($2.50 / $15). The engine renders the gross margin from the verified rates:

Show the recompute-verified inputs and outputs

$29 SaaS on Claude Sonnet 4.6 ($3 input / $15 output per MTok)

Inputs
subscription_price	29
avg_api_calls_per_day	20
avg_input_tokens	500
avg_output_tokens	1000
input_cost_per_million	3
output_cost_per_million	15
hosting_cost_per_user	0.5
other_per_user_costs	0.25

Result
api cost per user	9.9
total cost per user	10.65
gross margin per user	18.35
gross margin percent	63.3
api share percent	93
dominant cost driver	AI API
scale tiers › row 1 › users	100
scale tiers › row 1 › total revenue	2900
scale tiers › row 1 › total cost	1065
scale tiers › row 1 › total profit	1835
scale tiers › row 1 › margin percent	63.3
scale tiers › row 2 › users	1000
scale tiers › row 2 › total revenue	29000
scale tiers › row 2 › total cost	10650
scale tiers › row 2 › total profit	18350
scale tiers › row 2 › margin percent	63.3
scale tiers › row 3 › users	10000
scale tiers › row 3 › total revenue	290000
scale tiers › row 3 › total cost	106500
scale tiers › row 3 › total profit	183500
scale tiers › row 3 › margin percent	63.3
insight	AI API is 93% of your per-user cost ($9.9/mo). At 10K users, that is $99000/month on API alone. Caching responses or using a tiered model approach could significantly improve margins.

Computed live at build time.

$29 SaaS on OpenAI GPT-5.4 ($2.50 input / $15 output per MTok)

Inputs
subscription_price	29
avg_api_calls_per_day	20
avg_input_tokens	500
avg_output_tokens	1000
input_cost_per_million	2.5
output_cost_per_million	15
hosting_cost_per_user	0.5
other_per_user_costs	0.25

Result
api cost per user	9.75
total cost per user	10.5
gross margin per user	18.5
gross margin percent	63.8
api share percent	92.9
dominant cost driver	AI API
scale tiers › row 1 › users	100
scale tiers › row 1 › total revenue	2900
scale tiers › row 1 › total cost	1050
scale tiers › row 1 › total profit	1850
scale tiers › row 1 › margin percent	63.8
scale tiers › row 2 › users	1000
scale tiers › row 2 › total revenue	29000
scale tiers › row 2 › total cost	10500
scale tiers › row 2 › total profit	18500
scale tiers › row 2 › margin percent	63.8
scale tiers › row 3 › users	10000
scale tiers › row 3 › total revenue	290000
scale tiers › row 3 › total cost	105000
scale tiers › row 3 › total profit	185000
scale tiers › row 3 › margin percent	63.8
insight	AI API is 92.9% of your per-user cost ($9.75/mo). At 10K users, that is $97500/month on API alone. Caching responses or using a tiered model approach could significantly improve margins.

Computed live at build time.

The two engine runs land within a single point of gross margin of each other. The reason is that output tokens dominate the bill (1,000 output vs 500 input per call, and output is five times the input rate on both), and the output rate is identical at $15. The $0.50 input difference barely moves the per-user economics. The practical read: at the mid tier, the Claude-vs-GPT choice is not a margin decision. It is a capability decision. Pick the model that produces better output on your task, because the cost difference is noise.

The lever that does move margin is in the engine's own insight line: AI API cost is the dominant share of per-user cost in this scenario. Cutting output tokens (tighter prompts, shorter completions) or moving routine calls to the cheap tier (Haiku 4.5 or GPT-5.4-mini) changes margin far more than switching vendors at the same tier. The cheapest LLM API ranking covers the full provider field.

5. Decision guidance

Output-heavy generation at the flagship tier: Claude Opus 4.7 at $25 output undercuts GPT-5.5 at $30, but verify against the Opus tokenizer overhead on your prompts.
Input-heavy retrieval at the mid tier: GPT-5.4 at $2.50 input edges Claude Sonnet 4.6 at $3, with identical output.
High-volume cheap classification: GPT-5.4-nano at $0.20 / $1.25 is far below anything in Anthropic's published lineup.
Heavy cacheable context: both cut cached input ~90%; the decision stays a capability call, not a price call.

Re-verify both pricing pages before committing to a model in production. Per-token rates at this layer move with each model release. For the broader field including open-weight and low-cost providers, see the cheapest LLM API ranking and DeepSeek vs Gemini pricing.

All per-token figures verified against official pricing pages as of 2026-06-12. Anthropic's Opus flagship is now Claude Opus 4.8 (since 2026-05-28) at the same $5/$25 standard rate.

Frequently asked questions

Is Claude or GPT cheaper per token in 2026?

At the mid tier, OpenAI's GPT-5.4 is marginally cheaper on input ($2.50 per million input tokens) than Claude Sonnet 4.6 ($3 per million), with identical $15 output, verified on both official pricing pages as of May 2026. At the flagship tier, Claude Opus 4.7 ($5 input / $25 output) is cheaper on output than GPT-5.5 ($5 input / $30 output). At the cheap tier, GPT-5.4-mini ($0.75 / $4.50) is close to Claude Haiku 4.5 ($1 / $5). The blended winner depends on your input-to-output ratio.

How much does prompt caching reduce cost?

Both vendors discount cached input heavily. Anthropic charges 0.1x the base input rate on a cache read, so a Sonnet 4.6 cache hit is $0.30 per million tokens instead of $3. OpenAI's cached input is also roughly a 90% discount, for example GPT-5.4 cached input at $0.25 against $2.50 standard. For agents that resend a large system prompt every call, caching can cut effective input cost by 80 to 90 percent.

Which has the better batch discount?

Both offer a 50 percent batch discount for asynchronous workloads. Claude Sonnet 4.6 batch is $1.50 input / $7.50 output per million tokens; GPT-5.4 batch is $1.25 input / $7.50 output. The batch path is the cheapest route on both for non-time-sensitive jobs like bulk classification or document processing.

References

Sources

Primary sources only. No vendor-marketing blogs or aggregated secondary claims.

1 Anthropic — Claude API pricing (Opus 4.8 flagship, Sonnet 4.6, Haiku 4.5 per-MTok rates, caching, batch) — accessed 2026-06-12
2 OpenAI — API pricing (GPT-5.5, GPT-5.4, GPT-5.4-mini, GPT-5.4-nano per-MTok rates, cached input, batch) — accessed 2026-06-12

Tools referenced in this article

Run the Numbers

AI Product Margin Calculator

Calculate per-user margin for AI products from subscription price, API token costs, hosting, and per-user expenses.

9 min

Mistral vs OpenAI API Pricing 2026: Per-Token Cost Compared

Mistral vs OpenAI API pricing 2026: Mistral Large 3 is $0.50/$1.50 per million tokens, GPT-5.4 is $2.50/$15. EU data residency and the cost gap compared.

11 min

Cheapest LLM API 2026: 8 Providers Ranked by Blended Cost

Cheapest LLM API 2026: DeepSeek V4-flash and Gemini 2.5 Flash-Lite lead at ~$0.18 blended per million tokens. Eight providers ranked at a 3:1 token mix.