Thinking Machines develops AI that processes input and replies simultaneously
Thinking Machines Lab, the startup Mira Murati launched after leaving OpenAI, unveiled a new class of “interaction models” on Monday. The core idea is simple yet unfamiliar: an AI that can listen and talk at the same time, mimicking the flow of a phone call rather than the stop‑start of a text chat. The company calls this capability “full duplex.” Its first prototype, TML‑Interaction‑Small, reportedly generates a reply in 0.40 seconds—about the pace of a natural human exchange and noticeably quicker than comparable offerings from OpenAI and Google, according to the firm’s own benchmarks.
Still, the technology is a research preview, not a consumer product. A limited preview is slated for the next few months, with a broader rollout expected later in the year. The numbers look promising, but they come from the company’s own benchmarks, and the real‑world experience remains untested.
The move raises questions about whether native interactivity will translate into a smoother conversational feel once the model reaches a wider audience.
Why this matters
Thinking Machines Lab is attempting a shift from the classic turn‑taking dialogue model to a “full duplex” interaction in which the AI listens and talks simultaneously. If TML‑Interaction‑Small truly delivers responses in 0.40 seconds while still processing incoming speech, developers could prototype more fluid conversational agents without the pauses that turn‑taking imposes between one speaker finishing and the other starting. Founders may wonder whether this architecture reduces the need for complex state‑management logic, yet the announcement offers no data on accuracy or resource consumption, leaving open the question of scalability.
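To make the contrast with turn‑taking concrete, here is a minimal Python sketch of the full‑duplex idea. It is not Thinking Machines Lab’s implementation, whose internals the announcement does not describe. It assumes a toy setup in which user audio arrives as text chunks and a hypothetical "<barge-in>" marker signals an interruption; the names listen, speak, and full_duplex are illustrative only.

```python
import asyncio


async def listen(incoming, heard, interrupt):
    """Consume user chunks as they arrive, even while the agent is speaking."""
    while True:
        chunk = await incoming.get()
        if chunk is None:  # sentinel: the user stream has closed
            break
        heard.append(chunk)
        if chunk == "<barge-in>":  # hypothetical interruption marker
            interrupt.set()
        await asyncio.sleep(0)  # yield so the speaker can run between chunks


async def speak(reply_words, spoken, interrupt):
    """Emit the reply word by word, stopping as soon as the user interrupts."""
    for word in reply_words:
        if interrupt.is_set():
            break
        spoken.append(word)
        await asyncio.sleep(0)  # yield so the listener can run between words


async def full_duplex(user_chunks, reply_words):
    """Run listening and speaking concurrently instead of in alternating turns."""
    incoming = asyncio.Queue()
    for chunk in user_chunks:
        incoming.put_nowait(chunk)
    incoming.put_nowait(None)

    heard, spoken = [], []
    interrupt = asyncio.Event()
    await asyncio.gather(
        listen(incoming, heard, interrupt),
        speak(reply_words, spoken, interrupt),
    )
    return heard, spoken


if __name__ == "__main__":
    heard, spoken = asyncio.run(
        full_duplex(["uh", "wait", "<barge-in>"],
                    ["The", "answer", "is", "forty", "two"])
    )
    print("heard:", heard)
    print("spoken before interruption:", spoken)
```

In a turn‑taking design, speak would start only after listen had finished. Here both coroutines share one event loop, so the reply can be cut short the moment an interruption is heard.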
Researchers will have a new benchmark to test: can an AI maintain coherence while interrupting its own output? The claim sounds promising, but without independent evaluation we cannot confirm whether the model handles overlapping inputs without degradation. Moreover, the brief description does not address how the system deals with ambiguous or conflicting cues when both streams operate together.
As we explore these interaction models, we should remain cautious, tracking real‑world performance before assuming they will redefine conversational AI design.
Further Reading
- AINews Thinking Machines' Native Interaction Models - Latent Space
- Thinking Machines Lab Unveils 'Interaction Models' for Real-Time Multimodal AI - TechCrunch
- Mira Murati's Thinking Machines Lab Previews Full-Duplex AI Interaction - The Verge
- Defeating Nondeterminism in LLM Inference - Thinking Machines Lab