
AI Agent Routing Slashes Enterprise Token Costs 90%

AT&T cuts AI orchestration costs 90% after handling 8B tokens daily


AT&T’s internal AI platform was swallowing roughly eight billion tokens each day, a volume that quickly exposed inefficiencies in the company’s orchestration layer. Faced with soaring compute bills, the telecom giant embarked on a systematic overhaul, hunting for a framework that could scale without draining resources. The effort wasn’t just about slashing spend; it required a modular approach that let engineers test new models, swap out services, and retire underperforming pieces without disrupting downstream applications.

To that end, AT&T instituted a “really rigorous” vetting process, pitting third‑party solutions against home‑grown tools. One standout was its Ask Data offering, built on a Relational Knowledge Graph that recently claimed the top spot on the Spider 2.0 text‑to‑SQL accuracy leaderboard. Other components, still under evaluation, have shown promising early results.

The outcome? A re‑engineered pipeline that now runs at a fraction of its former cost—up to a 90 percent reduction—while maintaining the flexibility needed for rapid experimentation. As the team puts it, “We need to be able to pilot, plug in and plug out different components.”
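The "plug in and plug out" idea maps naturally onto a component registry. Below is a minimal sketch of that pattern, not AT&T's actual orchestration layer; the names (`Pipeline`, `plug_in`, `plug_out`) are illustrative:

```python
from typing import Callable

class Pipeline:
    """Sketch of a plug-in/plug-out orchestration layer: components are
    registered by name so any one of them can be piloted, swapped, or
    retired without touching downstream callers."""

    def __init__(self) -> None:
        self._components: dict[str, Callable[[str], str]] = {}

    def plug_in(self, name: str, component: Callable[[str], str]) -> None:
        # Registering under an existing name swaps in the new version.
        self._components[name] = component

    def plug_out(self, name: str) -> None:
        # Retire a component; downstream code only sees a missing name.
        self._components.pop(name, None)

    def run(self, name: str, payload: str) -> str:
        if name not in self._components:
            raise KeyError(f"no component registered under {name!r}")
        return self._components[name](payload)
```

Because callers address components only by name, an underperforming piece can be retired or replaced mid-flight, which is the flexibility the quote above describes.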

"We need to be able to pilot, plug in and plug out different components." They do "really rigorous" evaluations of available options as well as their own; for instance, their Ask Data with Relational Knowledge Graph has topped the Spider 2.0 text to SQL accuracy leaderboard, and other tools have scored highly on the BERT SQL benchmark. In the case of homegrown agentic tools, his team uses LangChain as a core framework, fine-tunes models with standard retrieval-augmented generation (RAG) and other in-house algorithms, and partners closely with Microsoft, using the tech giant's search functionality for their vector store. Ultimately, though, it's important not to just fuse agentic AI or other advanced tools into everything for the sake of it, Markus advised. "Sometimes I've seen a solution over engineered." Instead, builders should ask themselves whether a given tool actually needs to be agentic.

Cutting orchestration spend by 90 percent is a striking figure, especially on a daily flow of eight billion tokens. Markus says the breakthrough came from abandoning a monolithic model pipeline in favor of a LangChain-based multi-agent stack, in which large "super agents" delegate work to smaller components. The new layer lets the team pilot, plug in, and plug out modules as needed, a flexibility they stress is essential at scale.
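The token-cost lever in a delegating stack is routing: a super agent sends cheap, well-bounded requests to a small model and reserves the expensive model for genuinely complex work. A minimal sketch follows; the model names, per-token prices, and the keyword heuristic are all illustrative assumptions, not AT&T's routing logic:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float          # hypothetical price, not a real rate
    handler: Callable[[str], str]

# Stub handlers stand in for actual model calls.
small = Model("small-fast", 0.0002, lambda p: f"[small] {p[:40]}")
large = Model("large-reasoning", 0.0100, lambda p: f"[large] {p[:40]}")

def route(prompt: str) -> Model:
    """Super-agent routing sketch: escalate only prompts that look
    long or complex; everything else goes to the cheap model."""
    complex_markers = ("explain", "analyze", "multi-step", "plan out")
    if len(prompt) > 500 or any(m in prompt.lower() for m in complex_markers):
        return large
    return small

def answer(prompt: str) -> tuple[str, float]:
    model = route(prompt)
    est_tokens = len(prompt.split()) * 1.3   # rough token estimate
    cost = est_tokens / 1000 * model.cost_per_1k_tokens
    return model.handler(prompt), cost
```

With a 50x price gap between the two tiers, shifting the bulk of traffic to the small model is what makes order-of-magnitude savings arithmetically plausible, though the article does not disclose AT&T's actual traffic mix.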

Rigorous benchmarking backs the design: their Ask Data tool, built on a relational knowledge graph, currently leads the Spider 2.0 text‑to‑SQL leaderboard. Yet the article stops short of explaining how the cost model was calculated or whether the savings are sustainable as token volumes grow. Moreover, it is unclear if the same architecture would deliver comparable gains for firms with different data footprints or regulatory constraints.

Can this model scale beyond AT&T? The results suggest a viable path for large operators, but broader applicability remains an open question. As AT&T continues to refine its orchestration, observers will watch for evidence that the approach can maintain performance without sacrificing accuracy.

Common Questions Answered

How did AT&T reduce its AI orchestration costs by 90%?

AT&T transitioned from a monolithic model pipeline to a LangChain-based multi-agent stack that allows for modular component management. The new approach enables engineers to pilot, plug in, and remove different AI components easily, dramatically reducing computational overhead and increasing flexibility in their AI infrastructure.

What volume of tokens was AT&T processing daily before their AI orchestration overhaul?

AT&T was handling approximately eight billion tokens each day, which exposed significant inefficiencies in their original AI orchestration layer. This massive token volume drove the company to seek a more scalable and cost-effective framework for managing their AI computational resources.

What key strategy did AT&T's chief data officer Andy Markus implement to improve AI orchestration?

Andy Markus implemented a LangChain-based multi-agent architecture where large "super agents" can delegate work to smaller components. This approach provides unprecedented flexibility, allowing the team to rapidly test, swap, and retire different AI services without disrupting the entire system.