Google launches Gemini Deep Research agent via new Interactions API
Google has opened a new route for developers who want more than a text‑completion engine. By exposing an Interactions API, the company lets external code call a single endpoint, /interactions, instead of juggling multiple services. The move signals a shift from treating large language models as isolated predictors toward embedding them in workflows that span several steps.
In practice, that means a developer can hand off a query, let the model fetch data, synthesize findings, and return a structured answer without writing custom glue code. The promise is especially relevant for tasks that stretch beyond a single prompt, such as academic literature reviews or market analyses, where “long‑horizon” reasoning is essential. While traditional models stop after predicting the next token, this infrastructure is designed to keep the conversation going, chaining actions and memory.
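To make that concrete, here is a minimal sketch of what a single call to that endpoint might look like. The base URL, API-key header, model identifier string, and request/response field names are assumptions for illustration; only the /interactions path and the idea of one unified endpoint come from the announcement.

```python
# Minimal sketch of one request to the unified endpoint.
# Assumed: base URL, "x-goog-api-key" header, and the "model"/"input"/
# "outputs" field names; only the /interactions path is from the article.
import os
import requests

API_KEY = os.environ["GEMINI_API_KEY"]
URL = "https://generativelanguage.googleapis.com/v1beta/interactions"  # assumed base URL

payload = {
    "model": "gemini-3-pro-preview",  # assumed id for the Gemini 3 Pro Preview model
    "input": "Compare recent approaches to long-horizon agent memory.",
}

resp = requests.post(
    URL,
    json=payload,
    headers={"x-goog-api-key": API_KEY},
    timeout=300,
)
resp.raise_for_status()
interaction = resp.json()

# One endpoint, one request: intermediate steps happen server-side, and a
# structured interaction object comes back instead of a bare completion.
print(interaction.get("id"), interaction.get("outputs"))
```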
That backdrop sets the stage for Google's first built‑in agent, Gemini Deep Research, and for the API's native "Deep Research" and MCP support.
Native "Deep Research" and MCP Support Google is using this new infrastructure to deliver its first built-in agent: Gemini Deep Research. Accessible via the same /interactions endpoint, this agent is capable of executing "long-horizon research tasks." Unlike a standard model that predicts the next token based on your prompt, the Deep Research agent executes a loop of searches, reading, and synthesis. Crucially, Google is also embracing the open ecosystem by adding native support for the Model Context Protocol (MCP).
Crucially, Google is also embracing the open ecosystem by adding native support for the Model Context Protocol (MCP). This allows Gemini models to directly call external tools hosted on remote servers, such as a weather service or a database, without the developer having to write custom glue code to parse the tool calls.
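As a rough illustration of what that could look like in a request, the sketch below attaches a remote MCP server as a tool source. The "tools" structure, its field names, and the weather-server URL are all hypothetical; only the fact of native MCP support comes from the announcement.

```python
# Sketch: pointing the model at a remote MCP server instead of hand-rolling
# tool-call parsing. The "tools" block and its fields are assumptions.
payload = {
    "model": "gemini-3-pro-preview",
    "input": "Will it rain in Zurich tomorrow? Use the weather tool if needed.",
    "tools": [
        {
            "type": "mcp_server",                      # assumed discriminator
            "url": "https://weather.example.com/mcp",  # hypothetical MCP host
        }
    ],
}
# During the interaction the model calls the MCP server directly; the
# developer never parses or dispatches the tool calls themselves.
```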
The Landscape: Google Joins OpenAI in the 'Stateful' Era
Google is arguably playing catch-up, but with a distinct philosophical twist. OpenAI moved away from statelessness nine months ago with the launch of the Responses API in March 2025. While both giants are solving the problem of context bloat, their solutions diverge on transparency:
- OpenAI (The Compression Approach): OpenAI's Responses API introduced Compaction, a feature that shrinks conversation history by replacing tool outputs and reasoning chains with opaque "encrypted compaction items." This prioritizes token efficiency but creates a "black box" where the model's past reasoning is hidden from the developer.
- Google (The Hosted Approach): Google's Interactions API keeps the full history available and composable. The data model allows developers to "debug, manipulate, stream and reason over interleaved messages." It prioritizes inspectability over compression (see the sketch below).
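To see why inspectability matters in practice, here is a sketch of what walking a hosted history could look like. The retrieval route and the "messages"/"role"/"content" fields are assumptions chosen to illustrate the quoted "interleaved messages" idea, not the documented API surface.

```python
# Sketch: retrieving a stored interaction and walking its interleaved
# messages. The GET route and the "messages" fields are assumptions; the
# point is that past reasoning and tool output stay visible rather than
# being folded into an opaque compaction item.
import os
import requests

API_KEY = os.environ["GEMINI_API_KEY"]
interaction_id = "int_abc123"  # hypothetical id returned by an earlier call

resp = requests.get(
    f"https://generativelanguage.googleapis.com/v1beta/interactions/{interaction_id}",
    headers={"x-goog-api-key": API_KEY},
    timeout=30,
)
resp.raise_for_status()

for msg in resp.json().get("messages", []):
    # User turns, model reasoning, and tool outputs remain inspectable.
    print(f"{msg.get('role')}: {str(msg.get('content'))[:80]}")
```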
Supported Models & Availability
The Interactions API is currently in Public Beta and is available immediately via Google AI Studio. It supports the full spectrum of Google's latest generation models, ensuring that developers can match the right model size to their specific agentic task:
- Gemini 3.0: Gemini 3 Pro Preview.
Google’s Interactions API reshapes how developers talk to models, shifting from single‑turn completions to ongoing, state‑aware dialogues. By keeping context alive across calls, the platform promises to reduce the overhead of re‑sending full histories and to support agents that can juggle tools and internal state. Gemini Deep Research, the first built‑in agent released on the /interactions endpoint, is billed as capable of “long‑horizon research tasks,” a step beyond the next‑token prediction of earlier models.
It can, in theory, persist information, chain reasoning steps, and call external resources without restarting from scratch each time. Yet the announcement offers no performance metrics, so it is unclear how reliably the agent handles truly complex investigations or how it compares with bespoke solutions built on the same API. The new interface appears to open doors for more autonomous AI applications, but developers will need to test its limits in real‑world settings before drawing firm conclusions about its practical impact.
Further Reading
- Google launches Interactions API to unify models and agents like Gemini Deep Research - Google Blog
- Build with Gemini Deep Research via the new Interactions API - Google Blog
- Release notes: Interactions API and Gemini Deep Research agent launched in Gemini API - Google AI for Developers
- Google launched its deepest AI research agent yet on the same day OpenAI dropped GPT-5.2 - TechCrunch
- Building agents with the ADK and the new Interactions API - Google Developers Blog
Common Questions Answered
What is the purpose of Google's new Interactions API and how does it differ from traditional text‑completion engines?
The Interactions API exposes a single /interactions endpoint through which developers can call Google's models and agents, eliminating the need to juggle multiple services. Unlike traditional text‑completion engines that only predict the next token, it supports multi‑step workflows and state‑aware dialogues that persist across calls.
How does the Gemini Deep Research agent perform "long‑horizon research tasks" through the Interactions API?
Gemini Deep Research runs a loop of searches, reading, and synthesis rather than a single token prediction, allowing it to gather information over multiple steps. By using the /interactions endpoint, it can maintain context and internal state, enabling comprehensive research that spans several queries.
What does "native Deep Research and MCP Support" mean for developers using the Gemini agent?
Native Deep Research means the agent is built into the platform and can execute complex research loops without additional integration work. MCP (Model Context Protocol) support lets Gemini models call external tools hosted on remote servers, such as a weather service or a database, without the developer writing custom glue code to parse the tool calls.
In what ways does the Interactions API reduce overhead for developers compared to sending full conversation histories each time?
The API keeps context alive across calls, so developers no longer need to resend the entire dialogue history with each request. This state‑aware approach cuts redundant token overhead and simplifies code, making it easier to build agents that juggle multiple tools and internal state.
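As a rough sketch of the pattern (the "previous_interaction_id" field below is an assumed name, not a documented parameter), a follow-up turn might reference the prior interaction instead of replaying the whole dialogue:

```python
# Sketch: a follow-up turn that points at the previous interaction rather
# than resending the full history. "previous_interaction_id" is an assumed
# field name illustrating the stateful pattern, not a documented parameter.
follow_up = {
    "model": "gemini-3-pro-preview",
    "previous_interaction_id": "int_abc123",  # hypothetical id from the prior turn
    "input": "Now restrict the findings to peer-reviewed sources.",
}
# Only the new turn crosses the wire; the hosted state supplies the rest.
```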
Why is the launch of Gemini Deep Research considered a shift from treating large language models as isolated predictors?
By embedding the model in a looped research workflow, Gemini Deep Research moves beyond isolated token prediction to act as an autonomous agent. This shift enables the model to interact with external data sources, maintain context, and deliver synthesized results over extended tasks.