Sakana AI unveils Sakana Fugu, showcasing Fugu Ultra’s advanced coding, reasoning, and testing capabilities in a futuristic t

Editorial illustration for Sakana AI launches Sakana Fugu; Fugu Ultra leads coding, reasoning and tests

Sakana AI launches Sakana Fugu; Fugu Ultra leads coding,...

Sakana AI launches Sakana Fugu; Fugu Ultra leads coding, reasoning and tests

By AI Daily Post Edited by Brian Petersen, Editor-in-Chief

June 22, 2026 • 2 min read

Sakana AI rolled out its newest offering, Sakana Fugu, today. The service looks like a single OpenAI‑compatible endpoint, but under the hood it’s a multi‑agent orchestration system that decides how to tackle each request. If a task can be solved directly, Fugu handles it itself; when the problem calls for more expertise, it pulls together a team of specialist models and coordinates their output.

What’s notable is that Fugu isn’t just a router—it’s a language model trained to call other LLMs, even spawning instances of itself recursively. It manages model selection, delegation, verification and synthesis without any hard‑coded roles or workflows, learning on the fly when to delegate and how agents should communicate.

The company frames the architecture as a safeguard against single‑vendor lock‑in. Recent export controls on Anthropic’s Fable and Mythos models, for example, prompted the team to build a system that can reroute around provider restrictions. As newer models emerge, they can be slipped into the pool, keeping the service adaptable without exposing the routing logic to users.

Fugu Ultra tops the four coding benchmarks, CharXiv Reasoning, and Humanity’s Last Exam. Regular Fugu leads SciCode, τ³ Banking, and Long Context Reasoning. GPT 5.5 wins MRCRv2, the only baseline win here.

Its Fugu models stand shoulder-to-shoulder with Anthropic’s Fable 5 and Mythos Preview. Those two are not in Fugu’s pool, since they are not publicly accessible.

Use Cases

Sakana AI ran a beta with close to 500 early users. The published examples favor long, multi-step tasks.

AutoResearch: An agent improved a small GPT’s training recipe autonomously.

Sakana AI Launches Sakana Fugu: An Orchestration Model That Routes Tasks Across a Swappable Pool of Frontier LLMs - MarkTechPost

Why this matters

Sakana AI’s Fugu system promises to hide the intricacies of multi‑agent orchestration behind a single endpoint, letting developers submit a request and let the platform decide whether a solo model or a coordinated team is needed. The claim that “the complexity of a multi‑agent system never reaches your code” is appealing, yet it leaves open how much control developers retain when a team of models is assembled. Fugu Ultra’s top scores on four coding benchmarks, CharXiv Reasoning, and Humanity’s Last Exam suggest strong performance in isolated tests, while regular Fugu leads in SciCode, τ³ Banking, and Long Context Reasoning.

GPT 5.5’s lone win on MRCRv2 shows competition is still viable. What remains unclear is how these results translate to production workloads, integration overhead, or cost. For founders, the promise of a plug‑and‑play orchestration layer could reduce engineering effort, but the trade‑offs in latency and predictability need careful evaluation.

Researchers may find a useful testbed for model collaboration, yet the lack of detail on the “swappable pool” of frontier LLMs invites skepticism about reproducibility and long‑term support.

Sakana AI launches Sakana Fugu; Fugu Ultra leads coding,...

Use Cases

Further Reading

Latest News

xAI adds /goal to Grok Build for autonomous multi-step coding with verification

Alibaba AI video model climbs to #2 as Sora withdrawal warns firms