2 days ago

From Tokens to Outcomes: Rethinking AI Tool Pricing Models in Enterprise SaaS

The enterprise AI billing conversation has shifted. What began as a race to maximize AI adoption at any cost is now colliding with the hard economics of production-scale deployment. Token consumption, once treated as a proxy for innovation, is increasingly being recognized for what it often is: an imprecise and expensive measure of actual business value.

A new pricing logic is emerging — one that charges for results, not compute.

Key Highlights

Enterprise AI is moving from tokenmaxxing to outcome-based pricing tied to real business results
Separating design-time reasoning from deterministic runtime is key to predictable AI cost curves
AI ops, small language models, and strong governance now define competitive AI tool selection

The Tokenmaxxing Era and Its Costs

For the past two years, many enterprises ran internal campaigns to drive AI adoption, sometimes incentivizing employees through leaderboard rankings and usage targets. The implicit assumption was that more AI usage would naturally translate into better outcomes. Analysts now describe this period as “tokenmaxxing” — a no-holds-barred experimentation phase driven largely by fear of falling behind.

The consequences are becoming visible on balance sheets. Astronomical compute bills, unpredictable cost curves, and underwhelming ROI have forced a reckoning. As Liz Miller, VP and Principal Analyst at Constellation Research, put it plainly: tokenmaxxing is “probably one of the most detrimental things to actually finding success in AI that we could have gone through.”

The core problem was structural. Enterprises were deploying frontier models — the most capable and most expensive — indiscriminately across tasks that could have been handled by far simpler tools. The incentive structure rewarded volume, not precision.

Outcome-Based Pricing: The Model Taking Shape

Against this backdrop, a wave of SaaS vendors is rethinking how they charge for AI. The direction is clear: away from per-seat subscriptions and raw token consumption, toward pricing tied to measurable outcomes.

Pegasystems is among the most prominent examples. Rather than billing for the compute consumed during agentic workflows, Pega is moving toward charging per resolved case — a model that directly aligns vendor revenue with customer value. Zendesk and Intercom have introduced similar outcome-based pricing options. Intuit’s CTO has publicly signaled the company is exploring alternatives to traditional subscription models.
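To make the difference concrete, the two billing approaches can be compared with simple arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, and every price and volume is a made-up figure, not an actual rate from Pega, Zendesk, or Intercom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyUsage:
    """Hypothetical monthly workload for one AI-assisted process."""
    cases_attempted: int
    cases_resolved: int
    tokens_per_case: int  # average tokens consumed per attempted case


def token_based_cost(usage: MonthlyUsage, price_per_1k_tokens: float) -> float:
    """Bill for compute: every attempted case consumes tokens, resolved or not."""
    total_tokens = usage.cases_attempted * usage.tokens_per_case
    return total_tokens / 1000 * price_per_1k_tokens


def outcome_based_cost(usage: MonthlyUsage, price_per_resolved_case: float) -> float:
    """Bill for results: only resolved cases are charged."""
    return usage.cases_resolved * price_per_resolved_case


# Assumed figures, chosen only to illustrate the calculation.
usage = MonthlyUsage(cases_attempted=10_000, cases_resolved=7_000, tokens_per_case=20_000)
token_bill = token_based_cost(usage, price_per_1k_tokens=0.01)
outcome_bill = outcome_based_cost(usage, price_per_resolved_case=0.25)

print(f"token-based bill:   {token_bill:,.2f}")
print(f"outcome-based bill: {outcome_bill:,.2f}")
print(f"token cost per resolved case: {token_bill / usage.cases_resolved:.4f}")
```

With these assumed inputs the token-based bill is roughly 2,000 and the outcome-based bill is 1,750. The totals matter less than the structure: under token billing the customer also pays for the 3,000 cases that were attempted but never resolved, while under outcome billing that risk sits with the vendor.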

This is not merely a pricing experiment. It reflects a deeper shift in how enterprise buyers are evaluating AI tools — and how vendors must now justify their value propositions.

The Architecture Behind the Economics

What makes outcome-based pricing viable is not just commercial intent — it requires a fundamentally different technical architecture. Pega’s approach offers a useful model for understanding this.

Reasoning at Design Time, Not Runtime

Pega CTO Don Schuerman describes a deliberate separation between where reasoning happens and where execution happens. Heavy reasoning is concentrated in the design phase, where agents develop workflows that humans then evaluate and approve. At runtime, lighter and cheaper models follow a deterministic script.

“What I don’t want the agent doing is re-reasoning the whole workflow, which is where the token costs actually get really expensive,” Schuerman explained. “I want to use most of the reasoning at design time, and I want the runtime to be deterministic when it can be.”

This architecture makes costs both more predictable and more defensible — two properties that enterprise procurement teams increasingly demand.
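The separation Schuerman describes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Pega's implementation: `Workflow`, `design_workflow`, `approve`, `run`, and the stub reasoner are all hypothetical names, and the expensive model call is replaced by a plain function so the example runs on its own.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Workflow:
    name: str
    steps: Tuple[str, ...]   # ordered step names proposed at design time
    approved: bool = False   # set only after human review


def design_workflow(goal: str, reasoner: Callable[[str], Tuple[str, ...]]) -> Workflow:
    """Design time: a single expensive reasoning call proposes the steps."""
    return Workflow(name=goal, steps=tuple(reasoner(goal)))


def approve(workflow: Workflow) -> Workflow:
    """Stand-in for the human evaluation gate; returns an approved copy."""
    return replace(workflow, approved=True)


def run(workflow: Workflow, handlers: Dict[str, Callable[[dict], dict]], case: dict) -> dict:
    """Runtime: follow the approved script in order, with no re-planning."""
    if not workflow.approved:
        raise PermissionError(f"workflow {workflow.name!r} has not been approved")
    for step in workflow.steps:
        case = handlers[step](case)
    return case


reasoning_calls = 0


def stub_reasoner(goal: str) -> Tuple[str, ...]:
    """Placeholder for a frontier-model call; counts how often it is invoked."""
    global reasoning_calls
    reasoning_calls += 1
    return ("validate", "refund", "notify")


# Deterministic step handlers; a cheaper model could sit behind any of them.
HANDLERS = {
    "validate": lambda c: {**c, "valid": c["amount"] <= 100},
    "refund": lambda c: {**c, "refunded": c["valid"]},
    "notify": lambda c: {**c, "notified": True},
}

approved_flow = approve(design_workflow("small-refund", stub_reasoner))
results = [run(approved_flow, HANDLERS, {"amount": amount}) for amount in (40, 250)]
```

In this sketch the reasoner is invoked once, however many cases pass through `run`, which is the property that keeps the runtime cost curve flat. The approval check also shows why the design phase carries so much weight: the runtime will execute exactly what was approved and nothing else.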

The Role of Small Language Models

Frontier models are not always the right tool. PayPal’s SVP and Global Head of AI, Prakhar Mehrotra, articulated the emerging calculus clearly: fine-tuned open-source models, applied to specific tasks within a broader agentic flow, can dramatically reduce token consumption without sacrificing performance where it matters.

“You’re not consuming the full tokens of a frontier model,” Mehrotra noted. “That’s a trade-off, and it’s problem-specific, but that’s what you’re trying to optimize.”

Small language models and specialized open-weights models are becoming a serious component of enterprise AI architecture — not as replacements for frontier models, but as precision instruments deployed where they are sufficient.

AI Ops as a Discipline

The shift toward cost-conscious AI deployment is also creating demand for a new operational skill set. Miller describes this as “AI ops” — the capability to decide where, when, and how different AI models and functions are applied within a given workflow.

This includes knowing when to reach for a traditional deterministic machine learning algorithm rather than a generative model. Many enterprises already have years of investment in classical ML pipelines. Layering agentic workflows on top of those existing systems — rather than replacing them wholesale — is emerging as a more pragmatic and cost-efficient architecture.

The implication for tool selection is significant. Enterprises evaluating AI platforms will increasingly need to assess not just raw capability, but composability: how well a tool integrates into a hybrid stack of frontier models, small models, and classical ML.
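One way to picture the "AI ops" decision is as a router that sends each task to the cheapest tier able to handle it. The sketch below is a simplified assumption, not a description of any vendor's product: the tier names, the capability scale, and the per-call costs are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class ModelTier:
    name: str
    capability: int       # made-up scale: higher handles harder tasks
    cost_per_call: float  # illustrative cost, not a real price


# Hypothetical hybrid stack: classical ML, a fine-tuned small model, a frontier model.
TIERS = (
    ModelTier("classical-ml", capability=1, cost_per_call=0.0001),
    ModelTier("small-finetuned", capability=2, cost_per_call=0.002),
    ModelTier("frontier", capability=3, cost_per_call=0.05),
)


def route(required_capability: int, tiers: Sequence[ModelTier] = TIERS) -> ModelTier:
    """Pick the cheapest tier whose capability meets the task's requirement."""
    eligible = [t for t in tiers if t.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no tier can handle capability {required_capability}")
    return min(eligible, key=lambda t: t.cost_per_call)


def workload_cost(requirements: Iterable[int]) -> float:
    """Total cost when every task is routed to its cheapest sufficient tier."""
    return sum(route(r).cost_per_call for r in requirements)


# A workload where most tasks are simple and only one needs the frontier tier.
workload = [1, 1, 2, 3]
routed_cost = workload_cost(workload)
all_frontier_cost = len(workload) * TIERS[-1].cost_per_call

print(f"routed: {routed_cost:.4f}  all-frontier: {all_frontier_cost:.4f}")
```

In practice the hard part is estimating `required_capability` for a given task, which this sketch takes as given. That estimate is the problem-specific trade-off Mehrotra refers to, and it is where the operational skill lies.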

Governance: The Concentrated Risk

Outcome-based pricing and design-phase reasoning are not without risk. Keith Kirkpatrick, VP and Research Director at Futurum Group, identifies a structural vulnerability in Pega’s approach: by concentrating reasoning in the design phase, the model places enormous pressure on the quality of upfront configuration and governance.

“Enterprises will scrutinize whether Pega can deliver consistent agent performance across complex, regulated workflows without introducing new bottlenecks or governance gaps,” Kirkpatrick noted in a recent research analysis.

If the design phase is well-governed, the model can deliver reliability and cost predictability at scale. If governance falters, the deterministic runtime becomes a liability rather than an asset, because it will faithfully repeat any flaw that was approved upstream. For enterprises in regulated industries such as financial services, healthcare, and insurance, this is not a theoretical concern.

What This Means for AI Tool Selection

Alignment of incentives matters. Outcome-based pricing forces vendors to stand behind the results their tools deliver. When evaluating platforms, buyers should ask whether the vendor’s commercial model is aligned with actual business outcomes — or simply with consumption.

Architecture determines cost trajectory. Tools that separate reasoning from execution, and that support hybrid model stacks, will be better positioned to deliver predictable costs at scale. This is now a legitimate evaluation criterion alongside capability benchmarks.

Governance is non-negotiable. As reasoning shifts upstream into design and configuration phases, the governance frameworks around those phases become critical. Enterprises should assess vendor maturity in this area with the same rigor applied to security and compliance.

Small models are a strategic asset. The ability to fine-tune and deploy smaller open-weights models for specific tasks is no longer a niche capability. It is becoming a core component of cost-optimized enterprise AI architecture.

The Broader Market Signal

Pega’s stock is down 46% year to date — a figure that reflects broader market anxiety about SaaS vendors whose value propositions are being disrupted by the very AI capabilities they are trying to monetize. The pressure is structural, not cyclical.

Vendors that can credibly demonstrate outcome delivery, cost predictability, and governance maturity will be better positioned to survive this transition. Those that continue to sell token consumption as a proxy for value will face increasing resistance from enterprise buyers who have learned, often expensively, that the two are not the same thing.
