AI token costs surge as companies scramble for budget control

Companies across the tech industry are hitting hard budget limits on artificial intelligence spending. Uber consumed its entire 2026 AI coding budget by April. Microsoft revoked Claude Code licenses it had enabled for developers months earlier. A Priceline employee told TechCrunch that a routine contract renewal for Cursor came back four to five times more expensive than before.

Even though the price per token has fallen, the push for more AI adoption and the rise of autonomous agents have driven total token usage higher and higher. Companies that loaded up on all-you-can-eat subscriptions in early 2025 are now trying to figure out where their money went, pull back spending, and decide whether they can get any return on investment from the budget blowout.

A market is forming to address the problem. Startups, established vendors, and a new standards body are all racing to give companies tools and common language to track what they spend.

From "what can AI do" to "how much does it cost"

"Six months ago, I would have a conversation with a customer and it would be all about 'What can it do? Is it good enough?'" Alexander Embricos, OpenAI's head of enterprise, told TechCrunch at an event in New York City this week. "Our conversations are never about that now. Now the conversations are about, 'hey, we're spending so much. What visibility do you have? What auditability do you have? What token controls do you have? What is the efficiency of your models?'"

Against this backdrop, the Linux Foundation this week unveiled plans for the Tokenomics Foundation, a new standards body that aims to bring the same cost discipline to AI tokens that FinOps brought to cloud spending.

"In April and May, I started hearing from companies: 'Oh my god, we are 3x over our entire 2026 token budget and it's only April,'" J.R. Storment, executive director of the FinOps Foundation, a project under the Linux Foundation, told TechCrunch. "We started hearing existential crises, and the whole conversation shifted from tokenmaxxing and 'go fast' to 'we need guardrails, how do we control this?'"

The complaints followed intense demands from CEOs pushing their teams to use the best models and move quickly, ignoring costs. New models released in November, such as Anthropic's Claude Opus 4.5, OpenAI's GPT-5.1, and Google's Gemini 3 Pro, brought major improvements to agentic tools, which have multiplied consumption. One company reportedly ended up with a $500 million Claude bill after forgetting to set usage limits for employees.

Measuring the explosion in token use

"It's like the crack-cocaine epidemic," said Chris Reed, senior director of IT finance at Priceline, when asked about the pricing issue. "They let you try it to get you hooked on it, and now you're kind of beholden to it."

Vitaly Gordon, CEO of engineering operations platform Faros AI, said he recently spoke to a CTO who told him: "One of my engineers spent $40,000 on tokens last month, and I genuinely don't know whether I should stop him or should I go and tell everyone else to be like him."

A March survey of 20,000 developers by Faros found that output was rising, but so were bugs and rewrites. Jellyfish, an engineering management platform, similarly found that engineers who used the most tokens were about twice as productive as those who used AI less, but they spent 10 times as many tokens to get there.

Nicholas Arcolano, head of research at Jellyfish, told TechCrunch via email that AI expenditure is exploding largely due to agentic features, with per-developer consumption rising about 18.6 times in nine months. These numbers make the productivity case murkier than the spending suggests.
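To put those figures in perspective, here is a rough back-of-the-envelope calculation using only the numbers quoted above. It assumes the 18.6-times increase compounded evenly month over month, which the source does not state, so the monthly rate is illustrative rather than a reported figure. Under that assumption, consumption grew roughly 38% a month, and the heaviest users burned about five times as many tokens per unit of output.

```python
# Figures quoted in the article.
growth_factor = 18.6   # per-developer token consumption growth (Jellyfish)
months = 9             # period over which that growth occurred

# Assumption (not from the source): growth compounded evenly each month.
monthly_factor = growth_factor ** (1 / months)
monthly_pct = (monthly_factor - 1) * 100

# Jellyfish: heaviest users were ~2x as productive on ~10x the tokens.
output_ratio = 2.0
token_ratio = 10.0
tokens_per_unit_output = token_ratio / output_ratio

print(f"Implied monthly growth in token use: about {monthly_pct:.0f}%")
print(f"Tokens per unit of output, heavy vs. lighter users: {tokens_per_unit_output:.0f}x")
```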

"Whether extreme spend pays off comes down to the ultimate business value of shipped code (e.g. revenue), which most companies still can't measure," Arcolano said.

Part of the measurement problem is the sheer scale of AI usage today. "Tracking cloud costs is a hundreds-of-millions-of-rows-a-month data problem," Storment said. "Tracking token costs is a trillions-of-rows-a-month data problem. You can't just stick that into whatever spreadsheet or even basic tool. You've got to fundamentally rethink your tooling, your specs and your accounting systems to do that."

At Priceline, Reed is already seeing discrepancies between a vendor's reported usage and internal data. "I started my career in telecom expense management, and I'm seeing all the same parallels, from telecom to cloud to AI," he said. "Anytime you introduce something new, it's ripe for billing errors and audit and optimization opportunities."
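The kind of audit Reed describes can be sketched in a few lines. This is a minimal illustration, not Priceline's process or any vendor's tooling: the function name, the team keys, the 2% tolerance, and the sample token counts are all hypothetical. It compares vendor-reported token counts with internally logged counts and flags any gap larger than the tolerance.

```python
from dataclasses import dataclass


@dataclass
class Discrepancy:
    key: str
    vendor_tokens: int
    internal_tokens: int
    relative_gap: float  # (vendor - internal) / internal


def reconcile(vendor, internal, tolerance=0.02):
    """Flag keys where vendor and internal token counts differ by more
    than `tolerance`, measured relative to the internal count."""
    flagged = []
    for key in sorted(set(vendor) | set(internal)):
        v = vendor.get(key, 0)
        i = internal.get(key, 0)
        baseline = max(i, 1)  # avoid dividing by zero when usage was never logged
        gap = (v - i) / baseline
        if abs(gap) > tolerance:
            flagged.append(Discrepancy(key, v, i, gap))
    return flagged


# Hypothetical sample data, invented for illustration only.
vendor_report = {"team-a": 1_050_000, "team-b": 980_000, "team-c": 400_000}
internal_logs = {"team-a": 1_000_000, "team-b": 975_000}

flagged = reconcile(vendor_report, internal_logs)
for d in flagged:
    print(f"{d.key}: vendor={d.vendor_tokens:,} internal={d.internal_tokens:,}")
```

In this sample, team-a is flagged for a 5% overage and team-c for usage that appears on the vendor report but was never logged internally. Team-b falls within tolerance.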

The emerging market for token management

A market is beginning to form around this problem. Pure-play companies include Pay-i, which tracks, measures, and optimizes the costs and performance of GenAI investments. Paid lets developers track costs, measure usage, and bill users based on actual value rather than subscription fees.

Companies like Jellyfish, Waydev, and Faros AI provide AI agent monitoring to prove the ROI of developer tools. Storment says most of the 180 vendors within the FinOps Foundation are leaning into this space.

Existing vendors are also adding features. Ramp has moved into AI spend management. Datadog and New Relic have tacked on services like cloud cost management, token-level observability, and GPU monitoring. At the FinOps X conference next week, AWS is expected to introduce new financial management features for enterprise AI spending.

Tiffany Luck, a partner at NEA, thinks token efficiency and observability will likely be added at the harness or app layer. She pointed to Factory, a startup making AI agents for enterprises, which this week launched a model router that automatically picks the right model for each task.

Gordon expects frontier labs and other model providers to adopt OpenRouter-style optimization to drive queries to the cheapest models, a trend already showing up on enterprise Claude bills. "The financial report for how much you spend on Anthropic, even if you call the Opus model, some of the spend will be on Sonnet or Haiku, because they are smart enough to do it," Gordon said. "I think this will become more and more of a thing."
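The routing idea can be illustrated with a toy sketch. This is not how Factory, OpenRouter, or Anthropic implement routing, which the source does not describe. The model names, capability tiers, and prices below are placeholders invented for the example. The sketch picks the cheapest model whose capability tier meets what a task requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int        # hypothetical tier; higher means more capable
    price_per_mtok: float  # placeholder price per million tokens


# Placeholder catalog; names, tiers, and prices are invented for illustration.
CATALOG = [
    Model("small", capability=1, price_per_mtok=1.0),
    Model("medium", capability=2, price_per_mtok=5.0),
    Model("large", capability=3, price_per_mtok=25.0),
]


def route(required_capability, catalog=CATALOG):
    """Return the cheapest model that meets the required capability tier."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no model meets capability tier {required_capability}")
    return min(eligible, key=lambda m: m.price_per_mtok)


# A simple task goes to the cheapest model; a hard one to the most capable.
print(route(1).name)
print(route(3).name)
```

A production router would also have to estimate how hard a task is before choosing a tier, which is the difficult part and is left out here.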

A common language for token economics

All these tools are being built without a common language or shared definitions for how much a token costs, what it produces, and how to compare spend across vendors. That is where the Tokenomics Foundation hopes to prove useful.

The Foundation is building a canonical definition and framework for tokenomics; open standards, specifications, and metrics for AI token usage and billing; and new metrics for AI economics, such as cost-per-intelligence or tokens-per-watt. It also plans to define metrics across token factory effectiveness and consumption efficiency. The group plans a formal launch in July and will announce more members at the FinOps X conference next week.

"Token economics is fundamentally more abstract and opaque than anything we've managed at this scale before," Nishant Gupta, chief availability officer at Salesforce, said in a statement. "It requires a different operational muscle than the one the industry built for cloud."

Goldman Sachs projects global token usage will grow 24-fold by 2030. Companies already over budget need solutions now, but the Foundation's first deliverable is still months away.

"Maybe we created a steam engine, but we still haven't figured out the assembly line," said Gordon.

According to Arcolano, the smart move is broad, moderate adoption. "The best ROI comes from moving the broad middle from low to moderate usage, not pushing heavy users higher," he said.
