Gemini uses Model Context Protocol to book Uber and order DoorDash
Why does this matter? Because a conversational AI that can summon an Uber or fire off a DoorDash order without opening another app changes how we think about “chat‑first” experiences. Gemini, Google’s flagship model, now advertises the ability to complete those transactions from a single screen.
The trick isn’t magic; it’s a piece of middleware that lets the model speak the same language as the services it contacts. While the tech is impressive, the user never sees the back‑and‑forth between Gemini and the third‑party APIs. Instead, the interface jumps straight to the checkout confirmation you’d expect after you tap “confirm.” In other words, the heavy lifting happens behind the scenes, invisible to the end‑user.
That invisible layer is what the quote below refers to—the open‑source protocol that makes the hand‑off possible.
If there's a Model Context Protocol (MCP) integration--the open-source universal language that lets LLMs talk to third-party apps--then Gemini can run the task in the backend. (In this instance, you wouldn't see the whole process play out; you'd just see the final checkout step appear after you make the request.) There are also "App Functions" developers can build that allow Gemini to interface with their apps in a structured way. But if neither exists, Gemini can open the app itself and navigate through the buttons, text boxes, and menus to complete the task.
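MCP is built on JSON-RPC 2.0: a client (here, the model's runtime) invokes a tool exposed by a server with a `tools/call` request. As a minimal sketch of that message shape--the tool name and arguments below are hypothetical, since the actual Uber and DoorDash integrations aren't publicly documented--a request might look like this:

```python
import json

def mcp_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' request, the message shape
    an MCP client sends to invoke a tool exposed by an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool and arguments, for illustration only.
request = mcp_tool_call(
    "request_ride",
    {"pickup": "123 Main St", "dropoff": "Airport Terminal 2"},
)
print(json.dumps(request, indent=2))
```

The point of the shared shape is that any service exposing its actions this way can be driven by any MCP-capable model, which is what lets the hand-off stay invisible to the user.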
"This is the first time we're doing this on Android with applications, and so getting this right is really important," Samat says. "I think it's an exciting step forward in technology. We sort of view this as the beginning of a new era of mobile intelligence, and Android is where we think you see the future first." Privacy concerns abound when it comes to granting Gemini access to your apps.
Samat says that's why Google hasn't included any overly sensitive apps in this first batch for task automation. He says this data is not used for advertising, and that users can delete the data that Gemini sees. "We do think it's really important that people have trust in the system, and that comes from having control and transparency of what it's doing." While the smartphone screen is still required at the moment to complete the task, Samat envisions a future where you can start these tasks through other devices--say, a pair of smart glasses, an AI pendant, or even a car.
(There will be several new Android XR-powered smart glasses launching this year.) He says the company is looking at ways to enable final authentication on those other devices.
Can a chatbot really handle a ride-hailing request without opening the app? Google says Gemini can, thanks to the open-source Model Context Protocol, which lets the model talk directly to services like Uber and DoorDash. Unlike Siri's half-hearted attempt that merely opened the Uber app, or Google Assistant's clunky "order my usual" feature that was later pulled, Gemini performs the transaction in the background and only presents the final checkout screen.
The article notes that the backend work is hidden from the user, which could simplify the experience. But the description leaves open how authentication, error handling, and user consent are managed. Moreover, the claim hinges on an MCP integration existing: if a third-party service doesn't adopt the protocol, Gemini's ability to act on its behalf may be limited.
The approach marks a step toward more seamless AI‑driven actions, but practical reliability and privacy safeguards have yet to be demonstrated. As the technology matures, its real‑world usefulness will need careful evaluation.
Further Reading
- Google builds Gemini right into Android, adding contextual awareness within apps - Engadget