Speed Run to Legacy: Cheap AI Tokens, Expensive Mistakes | BitBrawn

Speed Run to Legacy: How Cheap AI Tokens Hide Expensive Mistakes

Falling token prices are masking the spiralling codebase complexity that makes your shiny new AI-assisted application a legacy system in waiting.


What we’re all seeing

A familiar cycle has formed in software teams recently: AI helps them ship faster, but they skip the part where someone actually understands the code. Codebase quality degrades, AI becomes less effective against the mess, and so they lean on it harder to compensate.

This cycle has been playing out almost in the open for months now, as software teams (and their bosses) realise just how much faster they can ship when AI generates their code for them. But there’s a problem they’re not seeing.

Messy codebases are a token multiplier

Agentic AI workflows don't just generate code. They read files, search codebases, retry when they fail, and re-send the full conversation history with every turn. One developer who tracked 42 agent runs found that 70% of the tokens consumed were wasted on reading too many files, exploring irrelevant paths, or repeating searches that had already been done.
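The re-sending behaviour is worth pausing on, because it means token use grows faster than the number of turns. A minimal sketch (the per-turn token count is an illustrative assumption, not a measured figure):

```python
def tokens_consumed(turns, tokens_per_turn=500):
    """Each turn re-sends the whole conversation so far, so turn n costs
    roughly n * tokens_per_turn input tokens. Total input tokens therefore
    grow with the square of the turn count, not linearly."""
    return sum(n * tokens_per_turn for n in range(1, turns + 1))

# A tidy codebase where the agent finishes a task in 5 turns:
clean = tokens_consumed(5)    # 7,500 input tokens

# A messy one needing 20 turns of file reads, retries, and repeated searches:
messy = tokens_consumed(20)   # 105,000 input tokens

print(messy / clean)  # 14.0 — 4x the turns costs 14x the tokens
```

This is why every extra exploratory turn an agent spends lost in a messy codebase is disproportionately expensive: it doesn't just add its own cost, it inflates the cost of every turn after it.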

Improving only the quality of context can cut token consumption dramatically for the same model and the same tasks. Research confirms that agents produce more errors and waste more tokens in unhealthy codebases, and a study of agent-generated patches found they performed measurably better on cleaner, simpler ones.

The token trap: a vicious cycle in which delivery pressure leads to AI-driven development, quality is neglected, the codebase degrades, token needs increase, and the cycle repeats.

The industry’s current focus on shipping as fast as possible means we’re sleepwalking into a situation where our codebases are getting messier and the effective token cost per useful task is rising, even as per-unit token prices fall dramatically. Gartner warned product leaders about this directly: “As commoditized intelligence trends toward near-zero cost, the compute and systems needed to support advanced reasoning remain scarce. CPOs who mask architectural inefficiencies with cheap tokens today will find agentic scale elusive tomorrow.”

Building a dependency at subsidised prices

The situation gets worse when you remember that AI labs are losing money on every complex query you send them. OpenAI is projecting $14 billion in losses for 2026, with cumulative losses of $44 billion before expected profitability in 2029. Current AI pricing often doesn't even cover the marginal cost of compute, because every major provider is prioritising ecosystem lock-in over profit.

This is the Uber playbook: subsidise the habit, then reprice once everyone’s dependent. Our dependency is growing fast, too. Enterprise AI bills have already tripled in the past year, despite token prices falling by 99.7% between 2023 and 2025. Companies are consuming vastly more despite the price drops.
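A back-of-envelope check on those two figures makes the scale of the dependency concrete. If prices fell 99.7% while bills tripled, implied consumption grew roughly a thousandfold:

```python
price_drop = 0.997   # token prices fell 99.7% between 2023 and 2025
bill_growth = 3      # enterprise AI bills tripled over roughly the same period

# Consumption = bill / price. Index both years to 2023 = 1.0:
relative_price = 1 - price_drop                    # ~0.003 of the old price
consumption_growth = bill_growth / relative_price

print(round(consumption_growth))  # ~1000x more tokens consumed
```

That growth curve, not the headline price per token, is what a post-subsidy repricing would multiply.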

When the subsidy ends, the trap springs shut

It’s tempting to argue that models will improve at understanding messy codebases faster than codebases degrade. But so far each new generation of frontier models has also expanded what teams attempt to do with AI, increasing consumption (and average codebase complexity) faster than efficiency gains reduce it.

When the subsidy eventually ends, those hit hardest will be the ones whose codebases demand the most tokens. If your team never ensured its developers understood the code and architecture AI produced, you won’t simply be able to dial back your dependence on it when the bill comes due.

Everything in the balance

The way out of this mess is to double down on codebase quality and comprehension. Use AI to speed up development, but not at the expense of your standards or your understanding. You don’t need to scrutinise every line, but you do need to know how your code works.

In the future, the worry isn't whether AI tools will get cheaper per token, it's whether your codebase will demand more tokens than you can afford. Legacy software modernisation has always been about addressing accumulated neglect before it becomes a crisis. The only thing that's changed is how quickly the neglect accumulates now.


Frequently asked questions

Why are AI coding costs rising when token prices are falling?

Because messy codebases force AI agents to read more files, retry failed attempts, and re-send longer conversation histories. The effective cost per useful task goes up even as the per-unit token price drops.

Are AI token prices subsidised?

Yes. Major AI providers are currently pricing below cost to drive adoption. OpenAI projects $14 billion in losses for 2026 alone. When pricing corrects, teams with inefficient codebases will feel the impact most.

How do you prevent AI-generated code from becoming legacy code?

Maintain the same quality standards you would for human-written code. Use AI to accelerate development, but make sure your team understands the code it produces. Clean codebases consume fewer tokens and produce better AI outputs.