OpenAI API Token Usage Surges to 15 Billion per Minute
OpenAI API token usage rises from 6 bn to 15 bn per minute, straining compute
The surge in demand for OpenAI’s services is hitting the back‑end hard. Over the past half‑year, the volume of tokens processed by the company’s API has more than doubled, pushing the infrastructure to its limits. Engineers report frequent outages, while the firm has begun rationing access to keep key customers online.
At the same time, the market for graphics processing units—essential for training and inference—has tightened, driving prices upward and squeezing margins. Inside the organization, senior leaders are forced to prioritize short‑term capacity, often juggling trade‑offs that would have been unthinkable a few months ago. It’s a pressure cooker scenario: relentless growth meets a finite pool of compute resources, and every additional request adds strain.
This context frames what OpenAI’s chief financial officer, Sarah Friar, told the Wall Street Journal about her day‑to‑day focus and the tough choices the company now faces.
Token usage across OpenAI's API jumped from 6 billion per minute in October to 15 billion per minute by the end of March, according to the WSJ. OpenAI CFO Sarah Friar told the WSJ that she spends much of her time hunting for near-term compute capacity and that the company is making difficult decisions about which projects to shelve because resources simply aren't available. Providers have been rolling out new limits since January to manage the agent boom. The capacity crisis is also reshaping plans for developer tools, which increasingly run agentic workloads that consume far more tokens.
Is the AI sector hitting a hard ceiling? Token consumption on OpenAI’s API surged from six billion per minute in October to fifteen billion by March, a spike that the Wall Street Journal says is straining the available compute pool. Outages are now common.
Enterprises are feeling the pinch as GPU prices climb and providers resort to rationing. Anthropic reports an API availability of 98.95 percent, well below the 99.99 percent benchmark, and is already losing enterprise customers to OpenAI. OpenAI is cutting Sora.
The company shut down its video‑generation app Sora to reallocate GPU cycles toward coding tools and its enterprise professional tier, a move CFO Sarah Friar said is necessary while she hunts for near‑term compute capacity. Can the sector secure enough silicon to keep pace? It remains unclear whether demand will subside.
Until providers can expand capacity or find sustainable pricing for GPUs, enterprises may continue to face throttled services, and the current compute crunch could shape short‑term product strategies across the industry.
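To put the availability figures cited above in concrete terms, the percentages can be converted into expected downtime. This is a rough sketch, not a measurement: it assumes a 30-day month and that availability is averaged evenly over that window, and the helper name `downtime_hours` is illustrative.

```python
def downtime_hours(availability_pct: float, period_hours: float = 30 * 24) -> float:
    """Expected downtime in hours for a given availability percentage.

    Assumes availability is averaged over `period_hours`
    (default: a 30-day month, an assumption made for illustration).
    """
    return (1 - availability_pct / 100.0) * period_hours


reported = downtime_hours(98.95)   # availability figure cited in the article
benchmark = downtime_hours(99.99)  # the "four nines" benchmark

print(f"98.95% availability: about {reported:.2f} hours down per 30 days")
print(f"99.99% availability: about {benchmark * 60:.2f} minutes down per 30 days")
```

Under these assumptions, the gap between the two figures is the difference between several hours and a few minutes of unavailability per month.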
Further Reading
- OpenAI's API Hits 15 Billion Tokens Per Minute as GPT-5.4 - Business Analytics Substack
- OpenAI’s APIs processed 6 billion tokens per minute in October 2025. By April this year, that had risen to 15 billion - Exponential View Substack
- Accelerating the next phase of AI - OpenAI
- We are growing revenue four times faster than the companies who defined the internet and mobile eras, OpenAI says it's making $2 billion a month — mostly from enterprise users - TechRadar
Common Questions Answered
How much has OpenAI's API token usage increased between October and March?
OpenAI's API token usage surged from 6 billion tokens per minute in October to 15 billion tokens per minute by the end of March. That is a 150% increase, or 2.5 times the October rate, in roughly half a year, putting significant strain on the company's computational infrastructure.
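The growth figure can be checked with a few lines of arithmetic. The sketch below uses only the two reported per-minute rates. The function name `percent_increase` is illustrative, and the per-day total simply extrapolates the March rate over 24 hours, which assumes the rate holds steady all day.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase from `old` to `new`."""
    return (new - old) / old * 100.0


october_rate = 6e9   # tokens per minute in October, as reported
march_rate = 15e9    # tokens per minute by the end of March, as reported

growth_pct = percent_increase(october_rate, march_rate)
multiple = march_rate / october_rate

# Extrapolation only: assumes the March rate is sustained around the clock.
tokens_per_day = march_rate * 60 * 24

print(f"Increase: {growth_pct:.0f}%")
print(f"Multiple of October rate: {multiple:.1f}x")
print(f"Tokens per day at the March rate: {tokens_per_day:.2e}")
```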
What challenges is OpenAI facing due to the massive increase in token usage?
OpenAI is experiencing frequent infrastructure outages and is being forced to ration access to its services to keep key customers online. The company's CFO, Sarah Friar, is spending considerable time searching for near-term compute capacity and making difficult decisions about which projects to postpone due to resource constraints.
How is the current GPU market affecting OpenAI's operations?
The graphics processing unit (GPU) market has tightened significantly, driving prices upward and squeezing profit margins for AI companies. This scarcity of computational resources is forcing providers like OpenAI to implement new limits and carefully manage their available compute capacity.