Grab, CJ ENM, LiveKit praise Gemini 3.5 Live Translate for quality and accuracy
Twenty years ago, Google turned a machine-learning experiment into a service that now translates over a trillion words each month for billions of users. The next chapter arrives with Gemini 3.5 Live Translate, an audio model that performs speech-to-speech translation in real time. It recognises more than 70 languages and outputs translated audio that keeps the original speaker's intonation, pacing and pitch, so the result sounds like a single, fluid conversation rather than a series of stop-and-go snippets.
Unlike traditional turn‑by‑turn systems, 3.5 Live Translate streams speech continuously, trading a tiny lag for smoother delivery; the output stays just a few seconds behind the source speaker. Google is rolling the feature out today across its ecosystem. Developers can test it in public preview through the Gemini Live API and Google AI Studio, enterprises get a private preview in Google Meet this month, and anyone with the Google Translate app on Android or iOS can try it immediately.
The model also processes multilingual input without manual settings and is built to handle noisy environments, promising a more seamless cross‑language experience.
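To make the streaming trade-off concrete, here is a toy timing model rather than a measurement of the product. It assumes the translated audio lasts about as long as the source speech, and it uses a hypothetical 3-second lag as a stand-in for "a few seconds"; the function names are illustrative and do not come from any Google API.

```python
def turn_based_finish(utterance_s: float, processing_s: float = 0.0) -> float:
    """Seconds until the listener has heard the whole translation when the
    system waits for the speaker to finish before it starts playback."""
    # Assumption: translated playback takes roughly as long as the source speech.
    return utterance_s + processing_s + utterance_s


def streaming_finish(utterance_s: float, lag_s: float) -> float:
    """Seconds until the listener has heard the whole translation when the
    output trails the speaker by a fixed lag."""
    return utterance_s + lag_s


if __name__ == "__main__":
    # Hypothetical utterance lengths in seconds; lag_s=3.0 is an assumed value.
    for utterance in (5.0, 20.0, 60.0):
        print(
            f"{utterance:>5.1f}s speech: "
            f"turn-based ends at {turn_based_finish(utterance):.1f}s, "
            f"streaming ends at {streaming_finish(utterance, lag_s=3.0):.1f}s"
        )
```

Under these assumptions, the turn-based delay grows with the length of each utterance, while the streaming delay stays fixed at the lag. That is the sense in which a small constant lag buys smoother delivery.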
Read the early reviews
In addition to Grab, companies such as CJ ENM, LiveKit and others have shared positive feedback on 3.5 Live Translate, highlighting its translation quality, accuracy and low latency.
Experience 3.5 Live Translate in your video meetings
Speech translation in Google Meet will soon use 3.5 Live Translate, improving the experience by:
- Offering 70+ languages, up from the previous limit of five.
- Enabling conversations across more than 2,000 language combinations in one meeting, where translation previously worked only to and from English.
- Updating the interface to provide instant access to speech translation.
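The figure of more than 2,000 language combinations is consistent with counting unordered pairs among 70 languages. The short calculation below is a sketch under that assumption: the source does not say how combinations are counted, and "70+" is treated here as exactly 70, so the results are lower bounds.

```python
from math import comb


def language_pair_counts(num_languages: int) -> tuple[int, int]:
    """Return (unordered_pairs, ordered_pairs) for a set of languages.

    Unordered pairs treat A<->B as one combination; ordered pairs count
    A->B and B->A as two separate translation directions.
    """
    unordered = comb(num_languages, 2)
    ordered = num_languages * (num_languages - 1)
    return unordered, ordered


if __name__ == "__main__":
    # Assumption: exactly 70 languages, the minimum implied by "70+".
    unordered, ordered = language_pair_counts(70)
    print(f"unordered pairs: {unordered}")  # 2415
    print(f"ordered pairs:   {ordered}")    # 4830
```

With 70 languages there are 2,415 unordered pairs, which matches "more than 2,000". Counting each translation direction separately would give 4,830.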
Why this matters
We've seen Google's translation pipeline evolve from a modest experiment two decades ago to a service that now handles over a trillion words each month, and Gemini 3.5 Live Translate is the latest checkpoint. The model promises fluid, natural-sounding speech conversion with low latency, a claim echoed by early adopters such as Grab, CJ ENM, and LiveKit, who cite impressive quality and accuracy. For developers building multilingual video meetings or real-time collaboration tools, a single speech-to-speech model could simplify architecture by removing the need for separate transcription and translation stages, while the lower latency keeps conversations flowing.
Researchers may find the audio‑only focus a useful testbed for studying end‑to‑end speech translation without text intermediaries. Yet, the reports are anecdotal; we lack systematic benchmarks or independent evaluations, so it remains unclear whether the model will maintain performance across less common language pairs or noisy environments. Founders should weigh the convenience against potential vendor lock‑in, while our community watches to see if the promised fluidity translates into consistent, production‑grade reliability.