Poolside AI launches Laguna XS.2 and M.1, hitting 72.5% on SWE-bench Verified

Poolside AI’s latest rollout adds two agentic coding models, Laguna XS.2 and M.1, to its portfolio. The company describes Laguna XS.2 as its second-generation mixture-of-experts (MoE) system, while M.1 builds on the same architecture with a focus on autonomous code generation. Both are positioned as “agentic” because they can initiate and manage coding tasks without step-by-step prompting.
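The announcement doesn’t go deeper on the architecture, but the MoE idea it rests on is standard: a learned gate scores a pool of expert sub-networks and routes each token to only the top-k of them, so just a fraction of the model’s parameters activates per token. Here is a minimal NumPy sketch of top-k gating; every name, shape, and expert count is illustrative and implies nothing about Laguna’s actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, top_k=2):
    """Toy MoE layer: the gate scores every expert, but only the
    top-k experts actually run on this token."""
    logits = x @ gate_w                      # one gate score per expert
    chosen = np.argsort(logits)[-top_k:]     # indices of the k best experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only the selected experts compute; the rest stay idle, which is how
    # MoE keeps per-token compute low despite a large total parameter count.
    return sum(w * experts[i](x) for w, i in zip(weights, chosen))

# Four toy "experts", each just a random linear map over an 8-dim token.
experts = [lambda x, W=rng.normal(size=(8, 8)): x @ W for _ in range(4)]
gate_w = rng.normal(size=(8, 4))             # gate: token -> expert scores
token = rng.normal(size=8)
print(moe_forward(token, experts, gate_w))   # 8-dim combined output
```

The appeal of the gate is that total capacity scales with the number of experts while per-token compute scales only with top_k, which is why MoE architectures are attractive for large coding models.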

That claim matters most when the models are pitted against established benchmarks that simulate real-world software development. SWE-bench, for instance, comes in several variants: Verified, a human-validated subset of real GitHub issues; Multilingual, which spans multiple programming languages; and Pro, a harder professional-grade set. Terminal-Bench 2.0 evaluates broader command-line and system interaction. Seeing how Laguna XS.2 and M.1 perform across these suites gives developers a concrete sense of their practical capabilities, and their limits.

Here’s how the numbers stack up:

- SWE-bench Verified: 72.5%
- SWE-bench Multilingual: 67.3%
- SWE-bench Pro: 46.9%
- Terminal-Bench 2.0: 40.7%

Laguna M.1 and Laguna XS.2 mark Poolside AI’s first public foray into its Laguna line, and the company has paired them with “pool,” a lightweight terminal-based coding agent and dual ACP client-server that mirrors the internal RL training environment. Those scores suggest progress, yet it is unclear how they translate to broader software-development tasks outside the benchmark suites.

The release of “pool” as a research preview may invite external validation, but adoption will depend on how easily researchers can integrate the dual client-server protocol into existing workflows. Beyond the mixture-of-experts label, details about Laguna XS.2’s architecture remain sparse, and without independent replication of the reported scores, the practical impact of these agents remains uncertain.
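The ACP detail is the practical hook here: agents speaking the Agent Client Protocol typically run as subprocesses and exchange JSON-RPC 2.0 messages with an editor or harness over stdin/stdout, and “pool” reportedly sits on both sides of that exchange. The Python sketch below shows what driving such an agent could look like; the `pool --acp` invocation is a placeholder rather than a documented command, and the newline-delimited framing and `initialize` handshake follow the public ACP convention, not anything Poolside has specified.

```python
import json
import subprocess

# Placeholder launch command: the real argv for Poolside's "pool" agent
# isn't documented here, so "--acp" is purely hypothetical.
proc = subprocess.Popen(
    ["pool", "--acp"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

def rpc(method, params, msg_id):
    """Send one JSON-RPC 2.0 request and read one newline-delimited reply."""
    request = {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())

# An ACP session opens with a capability exchange before any prompting;
# the exact fields below mirror the public ACP spec, not pool's docs.
print(rpc("initialize", {"protocolVersion": 1, "clientCapabilities": {}}, 1))
```

If pool really does mirror Poolside’s internal RL training harness, this subprocess-plus-JSON-RPC shape is what researchers would need to slot into their own evaluation loops.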

For now, the announcement provides concrete metrics and tooling, leaving the community to assess whether the performance gains hold up under real‑world coding pressures.
