Comparison · 8 min · 3 citations

AI Support Break-Even: Deflection Rate vs Ticket Volume

AI support break-even by deflection rate: engine-computed $8,655/mo saved at 70% vs $2,355 at 40%, and why token cost is not the variable.

By AI Biz Hub · Published May 26, 2026

Education · General business information, not legal, tax, or financial advice.

TL;DR

An AI support desk's payoff is set by its deflection rate, not its token cost. On 3,000 tickets a month at 10 minutes per human ticket and a $30/hour loaded cost, the AI vs Human Support Cost engine returns an $8,655/month saving (57.7% lower, $2.12 vs $5.00 per ticket) at 70% deflection, and a $2,355/month saving (15.7%) at 40% deflection.

The token cost of an AI-handled ticket is roughly a cent or two, so the model price barely matters. Nearly the whole AI-first bill is the human handling of escalated tickets plus the escalation overhead. Raise deflection to move the number; switching to a cheaper model does almost nothing.

The pitch for AI customer support is always "deflect tickets, save money," but the saving depends almost entirely on one rate the pitch glosses over: how many tickets the AI actually resolves without a human. This article runs the same 3,000-ticket-a-month desk at two deflection rates, holds everything else constant, and shows that the deflection rate — not the token price, not the model — is what decides whether the desk pays off. Every number is rendered live from the shipped engine bundle and recomputed in continuous integration; the model rate is a list price from the named vendor page, accessed 2026-05-26.

1. The deflection-rate model

The desk handles 3,000 tickets a month. A human ticket takes 10 minutes at a $30/hour loaded cost[2], so human-only support costs $5.00 per ticket and $15,000/month. An AI-first desk resolves a share of tickets at token cost only — 5,000 tokens per resolution on a Haiku-tier model at $1 input and $5 output per million tokens[1] — and escalates the rest to a human, who handles them at the full 10 minutes plus a 4-minute escalation overhead. The only input that changes between the two runs below is the deflection rate.
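The model above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the engine's source: the function name `support_costs` is hypothetical, and two details are assumptions chosen because they reproduce the engine's published totals, namely that tokens are billed on every ticket the AI attempts and that the 5,000 tokens split evenly between input and output.

```python
def support_costs(tickets, human_minutes, hourly_cost, deflection_pct,
                  tokens_per_ticket, input_price_mtok, output_price_mtok,
                  overhead_minutes, output_share=0.5):
    """Monthly cost of a human-only desk vs an AI-first desk.

    Assumptions (not stated in the article): tokens are billed on every
    ticket, and `output_share` of each ticket's tokens are output tokens.
    """
    per_minute = hourly_cost / 60
    human_only = tickets * human_minutes * per_minute

    # Blended $ per million tokens under the assumed input/output mix.
    blended_price = ((1 - output_share) * input_price_mtok
                     + output_share * output_price_mtok)
    token_cost = tickets * tokens_per_ticket * blended_price / 1_000_000

    # Escalated tickets pay the full human time plus the handoff overhead.
    escalated = tickets * (100 - deflection_pct) / 100
    escalation_labor = escalated * (human_minutes + overhead_minutes) * per_minute

    ai_first = token_cost + escalation_labor
    return {
        "human_only": round(human_only, 2),
        "ai_first": round(ai_first, 2),
        "savings": round(human_only - ai_first, 2),
    }


for pct in (70, 40):
    print(pct, support_costs(3000, 10, 30, pct, 5000, 1, 5, 4))
```

With the article's inputs this returns an AI-first cost of $6,345 at 70% deflection and $12,645 at 40%, matching the engine runs in the next two sections.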

2. 70% deflection: $8,655/month saved

Recompute-verified inputs and outputs: 3,000 tickets/mo at 70% deflection, Haiku-tier token cost, 4-min escalation overhead

Inputs
tickets_per_month: 3000
avg_human_minutes_per_ticket: 10
human_hourly_cost: 30 ($/hour)
ai_resolution_rate: 70 (%)
tokens_per_ai_resolved: 5000
ai_input_price_per_mtok: 1 ($ per million tokens)
ai_output_price_per_mtok: 5 ($ per million tokens)
escalation_overhead_minutes: 4

Result
human only monthly cost: $15,000
cost per human ticket: $5.00
ai first monthly cost: $6,345
cost per ai ticket: $2.12
monthly savings: $8,655
savings percent: 57.7%
break even tickets: 1
ai resolved count: 2,100
escalated count: 900

Computed live at build time.

At 70% deflection the engine returns a human-only cost of $15,000/month against an AI-first cost of $6,345/month: an $8,655 monthly saving, 57.7% lower. The per-ticket cost drops from $5.00 to $2.12. Of 3,000 tickets, 2,100 are AI-resolved at token cost and 900 escalate to humans. The 900 escalations cost $7.00 each (14 minutes at $30/hour), or $6,300, which leaves about $45 of the bill for tokens. The saving is large because 2,100 tickets that would have cost $5.00 each in human time now cost only a cent or two in tokens, and only the 900 escalations carry real labor.

3. 40% deflection: the saving collapses

Recompute-verified inputs and outputs: 3,000 tickets/mo at 40% deflection, identical token and overhead inputs

Inputs
tickets_per_month: 3000
avg_human_minutes_per_ticket: 10
human_hourly_cost: 30 ($/hour)
ai_resolution_rate: 40 (%)
tokens_per_ai_resolved: 5000
ai_input_price_per_mtok: 1 ($ per million tokens)
ai_output_price_per_mtok: 5 ($ per million tokens)
escalation_overhead_minutes: 4

Result
human only monthly cost: $15,000
cost per human ticket: $5.00
ai first monthly cost: $12,645
cost per ai ticket: $4.22
monthly savings: $2,355
savings percent: 15.7%
break even tickets: 1
ai resolved count: 1,200
escalated count: 1,800

Computed live at build time.

Drop deflection to 40% and the AI-first cost rises to $12,645/month, a saving of only $2,355, or 15.7%. Now 1,800 of the 3,000 tickets escalate, each carrying the full 10 minutes of human time plus the 4-minute overhead. The escalated volume doubled versus the 70% case (900 to 1,800), and with it the labor bill. The desk, the model, and the token price are the same; only the deflection rate changed, and the saving fell by 73%. This is the variable that decides the business case.
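To see how the saving moves between the two runs, the sketch below sweeps the deflection rate while holding everything else fixed. The constants are derived from the article's inputs; the flat $45 token bill is an assumption (1.5 cents on each of the 3,000 tickets) that reproduces the engine's totals, and `monthly_saving` is a hypothetical helper name.

```python
TICKETS = 3000
HUMAN_TICKET = 5.00      # 10 minutes at $30/hour
ESCALATED_TICKET = 7.00  # 14 minutes at $30/hour (10 base + 4 overhead)
TOKEN_BILL = 45.00       # assumption: $0.015 per ticket on all 3,000 tickets


def monthly_saving(deflection_pct):
    """Human-only cost minus AI-first cost at a given deflection rate."""
    escalated = TICKETS * (100 - deflection_pct) / 100
    ai_first = TOKEN_BILL + escalated * ESCALATED_TICKET
    return TICKETS * HUMAN_TICKET - ai_first


for pct in range(30, 91, 10):
    print(f"{pct}% deflection: ${monthly_saving(pct):,.0f}/month")

# The saving is linear in deflection: each point is worth the same amount.
per_point = monthly_saving(41) - monthly_saving(40)
print(f"One point of deflection: ${per_point:,.0f}/month")
```

Under these assumptions each percentage point of deflection is worth $210 a month, which is why the 30-point gap between the two runs accounts for the full $6,300 difference in saving.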

4. Token cost is not the variable that matters

It is tempting to optimize the model: switch to a cheaper tier, cut tokens per resolution. The math says do not bother. At 5,000 tokens per resolution and $1/$5 per million tokens, the token cost of an AI-handled ticket is between half a cent and 2.5 cents depending on the input/output mix; the engine's totals imply about 1.5 cents, or roughly $45/month. That is negligible against a $5.00 human ticket. Halving the token count or the model price changes the monthly bill by about $22. Nearly the entire AI-first cost is human labor on escalated tickets. Founders who tune the model to save on support are optimizing a line that is under 1% of the bill; the leverage is almost entirely in the resolution rate.
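A quick sensitivity check makes the comparison concrete. This is a sketch under two assumptions not stated in the article: an even input/output token split, and tokens billed on every ticket. The helper name `monthly_token_bill` is hypothetical.

```python
TICKETS = 3000
TOKENS_PER_TICKET = 5000
INPUT_PRICE, OUTPUT_PRICE = 1.0, 5.0  # $ per million tokens (list price)
OUTPUT_SHARE = 0.5                    # assumption: half of the tokens are output


def monthly_token_bill(price_multiplier=1.0, tokens=TOKENS_PER_TICKET):
    """Total monthly token spend across all tickets."""
    blended = (1 - OUTPUT_SHARE) * INPUT_PRICE + OUTPUT_SHARE * OUTPUT_PRICE
    return TICKETS * tokens * blended * price_multiplier / 1_000_000


base = monthly_token_bill()
half = monthly_token_bill(price_multiplier=0.5)

# One percentage point of deflection removes 30 escalations at $7.00 each.
one_point_of_deflection = TICKETS / 100 * 7.00

print(f"Token bill: ${base:.2f}/month")
print(f"Saved by halving the model price: ${base - half:.2f}/month")
print(f"Saved by one point of deflection: ${one_point_of_deflection:.2f}/month")
```

On these numbers, halving the model price saves $22.50 a month, while a single extra point of deflection saves $210.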

5. The escalation overhead is the hidden tax

Escalated tickets are more expensive than they would have been on a human-only desk, because the AI handoff adds overhead: here 4 minutes on top of the 10-minute base, a 40% surcharge on every escalation. At high deflection that tax applies to a small slice of volume and barely registers; at low deflection it applies to most tickets and is a real drag. This is why a poorly tuned AI desk can add cost outright: if it deflects little and escalates clumsily, you pay token cost plus a labor surcharge on top of work you would have done anyway. The fix is the same one variable: raise deflection, which shrinks the escalated count and, with it, the total overhead.
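The size of that tax at each deflection rate can be isolated with a short calculation. The function name `overhead_tax` is hypothetical; the inputs are the article's.

```python
def overhead_tax(tickets, deflection_pct, overhead_minutes, hourly_cost):
    """Monthly labor cost of the handoff overhead alone, in dollars."""
    escalated = tickets * (100 - deflection_pct) / 100
    return escalated * overhead_minutes * hourly_cost / 60


for pct in (70, 40):
    tax = overhead_tax(3000, pct, overhead_minutes=4, hourly_cost=30)
    print(f"{pct}% deflection: ${tax:,.0f}/month in escalation overhead")
```

The overhead costs $1,800 a month at 70% deflection and $3,600 at 40%. In the 40% case the overhead alone is larger than the $2,355 saving the desk produces.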

6. The break-even rule for an AI support desk

On these inputs, AI per-ticket cost drops below human per-ticket cost once deflection passes roughly 29%: each escalation costs $7.00 against a $5.00 human ticket, so the desk has to resolve about 4 in every 14 tickets, plus a sliver for tokens, before the handoff overhead is paid back. Below that rate the AI-first desk costs more than the human-only desk. At both rates tested here the engine reports a break-even ticket count of 1, which reads as the saving being positive from the first ticket. But the business decision is not break-even; it is whether the saving justifies the setup, and that needs real deflection. The rule: only build an AI support desk for a ticket category where you can reasonably expect 60%-plus deflection, instrument the deflection rate as your single key metric, and minimize escalation overhead with clean handoffs. Do not spend effort on model selection for cost reasons; pick a capable mid-tier model and put all the optimization into deflection. Re-run the support cost engine with your own ticket volume and deflection estimate, and see the full operating picture in the AI Micro-SaaS Unit Economics Report[3].
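The deflection rate at which an AI-first desk costs the same as a human-only desk can be solved directly from the model's inputs. This is a sketch, not the engine's method: `break_even_deflection` is a hypothetical name, and the $0.015 per-ticket token cost is an assumption inferred from the engine's totals.

```python
def break_even_deflection(human_minutes, overhead_minutes, hourly_cost,
                          token_cost_per_ticket):
    """Deflection rate (0 to 1) where AI-first cost equals human-only cost."""
    per_minute = hourly_cost / 60
    human_ticket = human_minutes * per_minute
    escalated_ticket = (human_minutes + overhead_minutes) * per_minute
    # Solve: token + (1 - d) * escalated_ticket == human_ticket, for d.
    return 1 - (human_ticket - token_cost_per_ticket) / escalated_ticket


d = break_even_deflection(10, 4, 30, token_cost_per_ticket=0.015)
print(f"Break-even deflection: {d:.1%}")  # prints "Break-even deflection: 28.8%"
```

The result depends almost entirely on the ratio of overhead to total escalated handling time (4 of 14 minutes here), so cutting the handoff overhead lowers the break-even rate as well as raising the saving.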

Frequently asked questions

What deflection rate does an AI support desk need to pay off?

It pays off well below 70%, but the saving scales steeply with deflection. The AI vs Human Support Cost engine returns an $8,655/month saving (57.7% lower) at 70% deflection on 3,000 tickets a month, and a $2,355/month saving (15.7%) at 40% deflection. On these inputs the break-even, where AI per-ticket cost drops below human per-ticket cost, sits near 29% deflection, because every escalation carries a 4-minute overhead on top of the 10-minute base while the token cost of resolution is trivial. Above that point the question is whether the saving is large enough to justify the setup.

How much can AI customer support save per month?

On 3,000 tickets a month at 10 minutes per human ticket and a $30/hour loaded cost, the engine returns a human-only cost of $15,000/month. At 70% deflection the AI-first desk costs $6,345/month for an $8,655 saving; at 40% deflection it costs $12,645/month for a $2,355 saving. The per-ticket cost falls from $5.00 human-only to $2.12 at 70% deflection and $4.22 at 40%.

Is the AI token cost a big part of support cost?

No. At 5,000 tokens per AI-resolved ticket on a Haiku-tier model ($1 input, $5 output per million tokens), the token cost per ticket is a cent or two. The AI-first cost is dominated by the human handling of the tickets the AI escalates, plus the escalation overhead minutes. Picking a cheaper model barely moves the support bill; raising the deflection rate moves it a lot.

Why does escalation overhead matter in AI support cost?

Because escalated tickets cost more than they would have under a human-only desk. An escalated ticket carries the full original human handling time plus the escalation overhead the AI handoff adds — in this model 4 extra minutes on top of the 10-minute base. At low deflection most tickets escalate, so the overhead is applied to a large share of volume and erodes the saving. High deflection both shrinks the escalated count and shrinks the total overhead bill.

References

Primary sources only. No vendor-marketing blogs or aggregated secondary claims.

  1. Anthropic. API pricing (Claude Haiku 4.5 per-million rates for AI resolution cost). Accessed 2026-05-26.
  2. U.S. Bureau of Labor Statistics. Customer service representatives wage data (loaded-cost basis). Accessed 2026-05-26.
  3. AI Biz Hub. AI vs Human Support Cost methodology. Accessed 2026-05-26.
