Research & Benchmarks

DeepMind AI agent explores new games, explains its actions better than SIMA 1

3 min read

Why does an AI that can talk about its own moves matter? While most agents treat a game like a puzzle to solve, DeepMind’s latest system actually wanders into unfamiliar titles and even worlds it builds itself, learning by trial rather than by pre‑programmed rules. The researchers let the model pick up a new environment, then asked it to narrate what it was trying to achieve, step by step.

But the real test came when the researchers followed up with extra questions, something earlier agents struggled with. The new agent doesn't just recite a list of actions; it attempts to justify each choice, offering a glimpse into its reasoning process. Compared with DeepMind's previous interface, SIMA 1, the dialogue feels less like issuing commands and more like collaborating with a teammate.

That shift in tone is what the team highlights as a step toward more natural human‑AI interaction.
According to DeepMind, the system can explain its intentions, describe intermediate steps, and respond to follow-up questions - not perfectly, but much more effectively than SIMA 1. The result is a more cooperative and natural interaction that feels less like issuing commands and more like working with a digital partner.

How SIMA 2 performs in unfamiliar games

A key goal for SIMA 2 is solving tasks in games it has never encountered before.

In tests using the Minecraft-based MineDojo and the recently released game ASKA, SIMA 2 achieved significantly higher success rates than its predecessor. While SIMA 1 struggled with most tasks, SIMA 2 completed 45 to 75 percent of tasks in these new games, compared to SIMA 1's 15 to 30 percent. The system can also generalize abstract concepts - for example, taking what it learned as "harvesting" in one game and applying it as "mining" in another.

This level of transfer learning is key for AI systems meant to adapt to new and unfamiliar conditions. SIMA 2 processes multimodal inputs - such as speech, images, and emojis - and can handle more complex, multi-step instructions. The improved architecture also enables longer, real-time interactions at higher resolutions than before.

Learning through experimentation, not human data

One of the biggest upgrades is SIMA 2's ability to improve itself: it can learn new tasks through trial and error, without relying on human training data.


Can an AI truly explore a game without human hints? SIMA 2 attempts to do just that, navigating unfamiliar 3D worlds while forming its own plans. Built around DeepMind's Gemini model, the system moves beyond the voice-command limits of SIMA 1, reasoning about tasks and choosing actions autonomously.

It also tries to verbalize its intent, breaking down steps and answering follow-up queries, though the explanations are not flawless. DeepMind claims the agent can transfer what it learns to new games, but how far that generalization extends is not yet clear. Compared with its predecessor, the interaction feels less like issuing commands and more like a collaborative dialogue, a shift DeepMind itself highlights.

Still, how consistently the model can improve without any human input has not yet been quantified. The similarity to Nvidia's Voyager approach suggests a broader trend, but whether SIMA 2's reasoning holds up across diverse virtual environments remains an open question. Ultimately, the report shows progress, tempered by acknowledged limitations.

Common Questions Answered

How does SIMA 2’s ability to explain its intentions differ from SIMA 1?

SIMA 2 can narrate its goals, describe intermediate steps, and answer follow‑up questions, whereas SIMA 1 was limited to simple voice commands. Although its explanations are not flawless, they are considerably more detailed and cooperative than those of SIMA 1.

What role does DeepMind’s Gemini integration play in SIMA 2’s performance?

The Gemini integration provides the underlying reasoning engine that lets SIMA 2 plan autonomously and verbalize its intent. This architecture moves the agent beyond the command‑only interface of SIMA 1, enabling richer interaction and self‑directed exploration.

In what way was SIMA 2 tested on unfamiliar games, and what environment was used?

Researchers evaluated SIMA 2 by placing it in games it had never seen before, using the Minecraft-based MineDojo environment and the recently released game ASKA as benchmarks. The agent had to navigate the 3D world, formulate its own plans, and explain its actions without any human hints.

Can SIMA 2 transfer knowledge learned in one game to other new games?

Yes, the article notes that SIMA 2 demonstrates a capacity to transfer what it learns to new games, suggesting generalization beyond a single title. This ability is a key advantage over earlier agents that relied on pre‑programmed rules for each specific game.