Image: Researcher watches a monitor displaying a new video‑game level while DeepMind’s AI diagram glows beside the code.

DeepMind AI agent explores new games, explains its actions better than SIMA 1


When DeepMind let its newest agent wander into games it hadn’t seen before - even worlds generated on the fly - I was surprised it didn’t just follow a script. The model was dropped into a brand-new environment and asked to talk through what it was trying to do, step by step. It sounds a bit like watching someone think out loud, except the “someone” is a neural net.

The real twist came when the researchers tossed follow-up questions at it - something older bots usually fumbled over. Instead of rattling off a list of moves, the agent tried to justify each choice, giving a rough peek at its reasoning.

Compared with DeepMind’s earlier interface, SIMA 1, the conversation feels less like issuing commands and more like brainstorming with a teammate. According to the team, the system can explain its intentions, break down intermediate steps and answer follow-ups - not perfectly, but noticeably better than SIMA 1. The upshot is a chat that feels more cooperative, as if you’re working together rather than just giving orders.

How SIMA 2 performs in unfamiliar games

A key goal for SIMA 2 is solving tasks in games it has never encountered before.

In tests using the Minecraft-based MineDojo and the recently released game ASKA, SIMA 2 achieved significantly higher success rates than its predecessor. While SIMA 1 struggled with most tasks, SIMA 2 completed 45 to 75 percent of tasks in these new games, compared with SIMA 1's 15 to 30 percent. The system can also generalize abstract concepts - for example, taking what it learned as "harvesting" in one game and applying it as "mining" in another.

This level of transfer learning is key for AI systems meant to adapt to new and unfamiliar conditions. SIMA 2 processes multimodal inputs - such as speech, images, and emojis - and can handle more complex, multi-step instructions. The improved architecture also enables longer, real-time interactions at higher resolutions than before.
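
DeepMind hasn’t published the input format, so the following is a minimal sketch, assuming a plain Python container, of what a multimodal, multi-step instruction could look like. The class name MultimodalInstruction, its fields, and the comma-splitting steps() helper are all hypothetical illustrations, not SIMA 2’s actual interface:

```python
# Hypothetical sketch of a multimodal, multi-step instruction.
# Class name, fields, and the steps() helper are assumptions for
# illustration; they do not come from DeepMind.
from dataclasses import dataclass, field


@dataclass
class MultimodalInstruction:
    """A user instruction combining text with optional speech, images, and emojis."""
    text: str                          # e.g. "chop wood, collect stone, build a shelter"
    speech_audio: bytes | None = None  # raw audio, if the command was spoken
    reference_images: list[bytes] = field(default_factory=list)  # screenshots or sketches
    emojis: list[str] = field(default_factory=list)              # e.g. ["🪓", "🏠"]

    def steps(self) -> list[str]:
        """Naive split of a multi-step instruction into sub-goals."""
        return [s.strip() for s in self.text.split(",") if s.strip()]


instruction = MultimodalInstruction(
    text="chop wood, collect stone, build a shelter",
    emojis=["🪓", "🏠"],
)
print(instruction.steps())  # ['chop wood', 'collect stone', 'build a shelter']
```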

Learning through experimentation, not human data

One of the biggest upgrades is SIMA 2's ability to improve itself. It can learn new tasks through trial and error without relying on human training data.
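
The announcement doesn’t include training code, but the general pattern - attempt tasks, keep only the successes, learn from them - can be sketched in a few lines. In the toy loop below, propose_task, attempt, and the skill dictionary are stand-ins (assumptions, not DeepMind’s pipeline) for a task generator, an agent rollout, and whatever retraining would actually learn from successful episodes:

```python
# Toy sketch of trial-and-error self-improvement; not DeepMind's training code.
# propose_task, attempt, and skill are hypothetical stand-ins.
import random

random.seed(0)  # reproducible toy run


def propose_task() -> str:
    """Stand-in for a model that generates its own practice tasks."""
    return random.choice(["harvest wood", "mine ore", "build shelter"])


def attempt(task: str, skill: dict[str, float]) -> bool:
    """Stand-in for a game rollout; success gets likelier as skill grows."""
    return random.random() < skill.get(task, 0.1)


experience: list[str] = []    # successful episodes become the next training set
skill: dict[str, float] = {}  # toy proxy for what retraining would improve

for episode in range(200):
    task = propose_task()
    if attempt(task, skill):
        experience.append(task)  # keep the success, discard failures
        skill[task] = min(1.0, skill.get(task, 0.1) + 0.05)  # "retrain" on it

print(f"kept {len(experience)} successful episodes")
```

The point of the toy loop is the data flow: only the agent’s own successful attempts feed back into training, so no human demonstrations are required.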


Can an AI really wander through a game without any human nudges? SIMA 2 tries to do just that, roaming unknown 3D worlds and making up its own plans as it goes. It runs on DeepMind’s Gemini integration, so unlike SIMA 1 it isn’t stuck with voice-command shortcuts - it actually reasons about what needs doing and picks actions on its own.
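
In outline, that is an observe-reason-act loop: look at the screen, ask a reasoning model what to do and why, then emit an action. Below is a minimal sketch of that pattern; reason_about and the action strings are hypothetical stand-ins, not the Gemini API or SIMA 2’s real keyboard-and-mouse action space:

```python
# Minimal observe-reason-act sketch. reason_about and the actions are
# hypothetical stand-ins; a real agent would call a multimodal model here.
from dataclasses import dataclass


@dataclass
class Observation:
    frame: bytes  # raw screen pixels from the game
    goal: str     # current natural-language goal


def reason_about(obs: Observation) -> tuple[str, str]:
    """Stand-in for the reasoning step: returns (action, explanation)."""
    return "move_forward", f"Heading toward the objective for goal {obs.goal!r}"


def run_agent(goal: str, max_steps: int = 3) -> None:
    for step in range(max_steps):
        obs = Observation(frame=b"<pixels>", goal=goal)   # capture the screen
        action, explanation = reason_about(obs)           # plan and narrate
        print(f"step {step}: {action} -- {explanation}")  # act, thinking out loud


run_agent("collect wood")
```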

The system even attempts to talk back, spelling out what it’s thinking and handling follow-up questions, though the explanations sometimes miss the mark. DeepMind says the agent can carry what it learns over to other games, but the evidence for that kind of generalization is still thin. Compared with the first version, the feel shifts from issuing commands to a back-and-forth conversation - a change the team was keen to highlight.

We still don’t have hard numbers showing how well it improves without any human help. Its similarity to Nvidia’s Voyager hints at a wider move in the field, yet it’s unclear whether SIMA 2’s reasoning will hold up across a broad set of virtual settings. In short, the demo shows clear steps forward, but the limits are front and center.

Common Questions Answered

How does SIMA 2’s ability to explain its intentions differ from SIMA 1?

SIMA 2 can narrate its goals, describe intermediate steps, and answer follow‑up questions, whereas SIMA 1 was limited to simple voice commands. Although its explanations are not flawless, they are considerably more detailed and cooperative than those of SIMA 1.

What role does DeepMind’s Gemini integration play in SIMA 2’s performance?

The Gemini integration provides the underlying reasoning engine that lets SIMA 2 plan autonomously and verbalize its intent. This architecture moves the agent beyond the command‑only interface of SIMA 1, enabling richer interaction and self‑directed exploration.

In what way was SIMA 2 tested on unfamiliar games, and what environment was used?

Researchers evaluated SIMA 2 by placing it in games it had never seen before, using the Minecraft‑based MineDojo environment and the recently released game ASKA as benchmarks. The agent had to navigate the 3D world, formulate its own plans, and explain its actions without any human hints.

Can SIMA 2 transfer knowledge learned in one game to other new games?

Yes, the article notes that SIMA 2 demonstrates a capacity to transfer what it learns to new games, suggesting generalization beyond a single title. This ability is a key advantage over earlier agents that relied on pre‑programmed rules for each specific game.