The Cracks Are Showing Up in the Budget Line

Microsoft encouraged thousands of its engineers, designers, and project managers to experiment with Claude Code. Six months later, it’s canceling most of those licenses and steering developers back toward GitHub Copilot CLI. The tool worked. People loved it. That was the problem.
Uber’s story is even more pointed. The company built internal leaderboards to gamify AI adoption — ranking teams by how much they used coding tools. It worked so well that Uber burned through its entire 2026 AI coding budget by April. Four months into the year. Budget: gone.
These aren’t cautionary tales about AI failing. They’re cautionary tales about AI succeeding at exactly the wrong price point.
The Paradox Nobody Budgeted For

Here’s the uncomfortable math sitting at the center of enterprise AI right now.
Token prices are falling. Gartner projects inference costs on frontier models will drop nearly 90% by 2030. That sounds like a win — until you factor in that agentic AI workflows consume dramatically more tokens per task than simple prompt-response interactions. Goldman Sachs forecasts a 24-fold increase in token consumption by 2030, potentially reaching 120 quadrillion tokens per month.
Cheaper per unit. Vastly more units. Net result: bigger bills.
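As a rough back-of-the-envelope check, the sketch below multiplies the two forecasts cited above. Treating Gartner's price decline and Goldman Sachs's volume growth as if they applied to the same workload is an assumption made purely for illustration; the figures come from separate analyses.

```python
# Back-of-the-envelope: relative spend = relative price * relative volume.
# Assumption: both forecasts describe the same workload and time horizon.
PRICE_DROP = 0.90      # "nearly 90%" cheaper per token by 2030 (Gartner, as cited)
VOLUME_GROWTH = 24.0   # 24-fold token consumption by 2030 (Goldman Sachs, as cited)


def relative_spend(price_drop: float, volume_growth: float) -> float:
    """Return future spend as a multiple of today's spend."""
    return (1.0 - price_drop) * volume_growth


multiple = relative_spend(PRICE_DROP, VOLUME_GROWTH)
print(f"Spend multiple: {multiple:.1f}x")
```

Under those assumptions, a 90% price cut paired with 24 times the volume still leaves the bill at roughly 2.4 times today's.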
Bryan Catanzaro, VP of Applied Deep Learning at Nvidia, put it plainly: “For my team, the cost of compute is far beyond the costs of the employees.” That’s not a fringe opinion. That’s a senior leader at the company selling the GPUs saying the quiet part out loud.
Agentic AI Is the Multiplier Nobody Priced In

Standard AI usage — a developer asking Copilot to autocomplete a function — is relatively contained. Agentic AI is a different creature entirely.
Agents don’t answer one question. They plan, iterate, call tools, retry failures, and chain tasks across long sessions. Each step burns tokens. Each retry burns more. Jensen Huang envisions 100 AI agents working alongside every human employee at Nvidia. That’s a compelling vision. It’s also a token consumption number that would make most CFOs quietly leave the room.
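To make the multiplier concrete, here is a minimal sketch of how steps and retries compound. Every number in it (step count, tokens per step, retry rate, the size of a single prompt) is a hypothetical placeholder, not a measurement from any real tool.

```python
def agent_task_tokens(steps: int, tokens_per_step: int, retry_rate: float) -> int:
    """Estimate tokens for one agentic task.

    Assumes each step costs the same and that a failed step is retried once,
    so the expected number of attempts per step is 1 + retry_rate.
    """
    return round(steps * tokens_per_step * (1 + retry_rate))


# Hypothetical figures for illustration only.
single_prompt_tokens = 2_000
agent_tokens = agent_task_tokens(steps=25, tokens_per_step=4_000, retry_rate=0.2)

ratio = agent_tokens / single_prompt_tokens
print(f"Agentic task: {agent_tokens:,} tokens, about {ratio:.0f}x a single prompt")
```

The model is deliberately simple. Real agents often resend a growing context on each step, which would push the total higher than this linear estimate.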
Gartner’s Will Sommer framed the risk cleanly: don’t confuse the deflation of commodity tokens with the democratization of frontier reasoning. The cheap tokens and the useful tokens are increasingly not the same tokens.
What’s Actually Happening in the Market

A few patterns are crystallizing fast.
Consolidation over experimentation. Microsoft’s move from Claude Code to Copilot CLI isn’t abandoning AI — it’s rationalizing it. Expect more enterprises to consolidate around fewer, more controlled toolchains rather than letting every team run their own stack.
Budget governance is becoming a product feature. Usage caps, spend dashboards, and token budgets are no longer nice-to-haves. They’re table stakes. Tools that don’t surface cost visibility will lose procurement conversations.
The leaderboard era is ending. Incentivizing raw consumption (“tokenmaxxing,” Claudeonomics, usage rankings) was a reasonable way to drive adoption. It’s a terrible way to drive ROI. The metric is shifting from “how much AI are we using?” to “what are we actually getting for it?”
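One way to picture a usage cap is a guard that refuses a request once a team's monthly allowance would be exceeded. This is a minimal sketch: `TokenBudget`, `BudgetExceeded`, and the cap value are hypothetical names and numbers chosen for illustration, not any vendor's API.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a request would push usage past the monthly cap."""


@dataclass
class TokenBudget:
    monthly_cap: int   # hypothetical per-team allowance, in tokens
    used: int = 0

    def charge(self, tokens: int) -> int:
        """Record usage and return the remaining allowance."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.monthly_cap:
            raise BudgetExceeded(
                f"request of {tokens} tokens exceeds remaining "
                f"{self.monthly_cap - self.used}"
            )
        self.used += tokens
        return self.monthly_cap - self.used


budget = TokenBudget(monthly_cap=1_000_000)
remaining = budget.charge(250_000)
print(f"Remaining this month: {remaining:,} tokens")
```

A hard stop like this is the bluntest option. A softer variant might warn at a threshold or route requests to a cheaper model instead of raising.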
What This Means If You’re Choosing AI Tools Right Now

The enterprise cost trap isn’t a reason to avoid AI tooling. It’s a reason to choose more carefully.
For founders and product teams: Cost transparency is a competitive differentiator. If your tool helps teams understand what they’re spending and why, that’s a feature worth leading with.
For AI adopters: Before scaling any agentic workflow, model the token consumption at 10x usage. The unit economics that look fine at pilot scale often look alarming at production scale.
For anyone evaluating tools: Ask vendors directly how costs scale with agentic use cases. If they can’t answer clearly, that’s your answer.
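The 10x exercise suggested above can be sketched in a few lines. All inputs here (team size, tasks per day, tokens per task, price, workdays) are hypothetical placeholders to be replaced with figures from your own pilot, and the single blended price per million tokens is a simplification.

```python
def monthly_cost(
    developers: int,
    tasks_per_dev_per_day: int,
    tokens_per_task: int,
    usd_per_million_tokens: float,
    workdays: int = 21,
) -> float:
    """Estimate monthly spend in USD from simple usage assumptions."""
    tokens = developers * tasks_per_dev_per_day * tokens_per_task * workdays
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical pilot: 20 developers, 10 agentic tasks each per day.
pilot = monthly_cost(20, 10, 120_000, usd_per_million_tokens=5.0)

# Same workflow at 10x usage: ten times as many developers.
scaled = monthly_cost(200, 10, 120_000, usd_per_million_tokens=5.0)

print(f"Pilot:  ${pilot:,.0f} per month")
print(f"At 10x: ${scaled:,.0f} per month")
```

The estimate scales linearly with headcount. The more worrying cases are the ones where tasks per developer and tokens per task also grow as people come to rely on the tool, so it is worth varying those inputs too.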
The Honest Takeaway

The AI productivity revolution is real. So is the bill.
The companies hitting these walls — Microsoft, Uber — aren’t AI skeptics. They’re among the most aggressive AI adopters on the planet. The lesson isn’t that AI doesn’t work. It’s that the economics of scaling AI are fundamentally different from the economics of piloting it.
At scale, compute can already cost more than the coders using it, as Catanzaro’s team shows. That gap may narrow, but it won’t close on its own.
Observe the costs. Choose the tools that help you control them.