Nvidia AI Agent Trains Robots Autonomously, Editing Code from Papers
Robots still stumble on the kind of grasping a human takes for granted, and most real-world experiments still need a person to collect data, reset the scene and fine-tune algorithms after every try.
That bottleneck slows progress. Nvidia, Carnegie Mellon University and UC Berkeley have built ENPIRE, a system that hands those chores to an AI coding agent. A fleet of eight robots now runs a feedback loop on actual hardware (reset, act, evaluate, improve) without human scoring.
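The reset, act, evaluate, improve cycle can be pictured as a plain loop. The sketch below is only an illustration under stated assumptions: `run_feedback_loop`, `LoopStats` and the four callables are hypothetical names, not ENPIRE's code, and the "robot" is a toy stand-in where the policy is a single number.

```python
from dataclasses import dataclass, field


@dataclass
class LoopStats:
    """Running tally of trial outcomes for one (hypothetical) robot station."""
    successes: int = 0
    trials: int = 0
    history: list = field(default_factory=list)

    @property
    def success_rate(self) -> float:
        return self.successes / self.trials if self.trials else 0.0


def run_feedback_loop(reset, act, evaluate, improve, policy, num_trials):
    """Run reset -> act -> evaluate -> improve for a fixed number of trials."""
    stats = LoopStats()
    for _ in range(num_trials):
        state = reset()                              # return the scene to a start state
        outcome = act(policy, state)                 # execute the current policy
        success = evaluate(outcome)                  # automatic scoring, no human judge
        policy = improve(policy, outcome, success)   # update the policy from the result
        stats.trials += 1
        stats.successes += int(success)
        stats.history.append(success)
    return policy, stats


# Toy stand-ins for hardware: the "policy" is one number that should reach 0.5.
TARGET = 0.5
policy, stats = run_feedback_loop(
    reset=lambda: TARGET,
    act=lambda aim, target: target - aim,            # signed miss distance
    evaluate=lambda miss: abs(miss) < 0.05,          # success if the miss is small
    improve=lambda aim, miss, ok: aim + 0.5 * miss,  # move halfway toward the target
    policy=0.0,
    num_trials=10,
)
print(f"success rate: {stats.success_rate:.0%}, final policy: {policy:.4f}")
```

On real hardware each callable would wrap robot control, camera capture and a training step; the loop structure is the only part taken from the article.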
The agent even writes its own reward function after watching just a few minutes of success and failure videos. For a pin‑insertion task it combined visual alignment, gripper height and estimated force; for closing a cable tie it fused two camera views and cut reaction time to under 150 ms. The result?
The robots hit up to 99 percent success on tasks that were previously “tricky.” It’s a concrete step toward letting machines teach themselves the nuances of dexterous manipulation.
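To make the pin-insertion reward more concrete, here is a minimal sketch of how three signals could be blended into one score. The article names the ingredients (visual alignment, gripper height, estimated force) but not the formula, so the field names, thresholds and weights below are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class PinObservation:
    """Hypothetical per-step measurements; names and units are assumptions."""
    alignment_error_px: float  # pixel offset between pin and hole in the camera view
    gripper_height_m: float    # gripper height above the fully inserted position
    estimated_force_n: float   # estimated contact force


def clamp01(x: float) -> float:
    """Clip a value into the range [0, 1]."""
    return max(0.0, min(1.0, x))


def pin_insertion_reward(
    obs: PinObservation,
    max_error_px: float = 50.0,    # assumed worst-case misalignment
    start_height_m: float = 0.10,  # assumed height at the start of insertion
    force_limit_n: float = 15.0,   # assumed force at which the score bottoms out
    weights: tuple = (0.5, 0.3, 0.2),
) -> float:
    """Weighted sum of three scores, each scaled to [0, 1]; higher is better."""
    align_score = 1.0 - clamp01(obs.alignment_error_px / max_error_px)
    depth_score = 1.0 - clamp01(obs.gripper_height_m / start_height_m)
    force_score = 1.0 - clamp01(obs.estimated_force_n / force_limit_n)
    w_align, w_depth, w_force = weights
    return w_align * align_score + w_depth * depth_score + w_force * force_score


halfway = PinObservation(alignment_error_px=25.0, gripper_height_m=0.05, estimated_force_n=7.5)
print(round(pin_insertion_reward(halfway), 3))
```

A weighted sum is the simplest way to combine such terms; the agent's actual reward code may be structured quite differently.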
In the second phase, the agent works entirely on its own. It reads research papers, forms hypotheses, and edits the training code directly. It uses methods like behavior cloning, where the strategy mimics human demonstrations, or reinforcement learning, where the strategy improves through trial and error.
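Behavior cloning is, at its core, supervised learning on demonstration data. As a deliberately tiny illustration (not the policy architecture used in ENPIRE, which the article does not describe), the sketch below fits a one-dimensional linear policy to made-up (state, action) pairs by least squares; `fit_behavior_cloning` and the demonstration values are hypothetical.

```python
def fit_behavior_cloning(demos):
    """Fit action = w * state + b to (state, action) pairs by least squares."""
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    var = sum((s - mean_s) ** 2 for s, _ in demos)
    if var == 0:
        raise ValueError("demonstrations need at least two distinct states")
    w = cov / var
    b = mean_a - w * mean_s
    return lambda state: w * state + b  # the cloned policy


# Made-up demonstrations in which the "human" always chose action = 2 * state + 1.
demos = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
policy = fit_behavior_cloning(demos)
print(policy(4.0))  # the policy imitates the demonstrator on an unseen state
```

Reinforcement learning differs in where the training signal comes from: instead of copying recorded actions, the policy is adjusted according to the reward its own attempts earn.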
The agent picks the method itself based on real-world success signals.
A robot fleet that coordinates through Git
ENPIRE scales to a full fleet: eight dual-arm YAM robot stations, each with its own hardware, computer, and coding agent. The agents test different hypotheses at the same time and share results only through Git, the standard version control tool for software.
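One way to picture Git-only coordination is for every station to commit a small result file that the other agents can read after pulling. The sketch below assumes a `results/<station>.json` layout, a made-up record schema and invented hypothesis names; none of these come from the article. The Git commands are only listed, not executed, so the example runs without a repository.

```python
import json
import tempfile
from pathlib import Path


def write_result(repo_dir, station, hypothesis, method, successes, trials):
    """Write one station's latest result to its own file in the shared repo."""
    results_dir = Path(repo_dir) / "results"
    results_dir.mkdir(parents=True, exist_ok=True)
    path = results_dir / f"{station}.json"
    record = {
        "station": station,
        "hypothesis": hypothesis,
        "method": method,
        "successes": successes,
        "trials": trials,
    }
    path.write_text(json.dumps(record, indent=2))
    return path


def git_share_commands(path, message):
    """Commands a station could run to publish its result (not executed here)."""
    return [
        ["git", "add", str(path)],
        ["git", "commit", "-m", message],
        ["git", "pull", "--rebase"],
        ["git", "push"],
    ]


def best_result(repo_dir):
    """Read every station's file and return the record with the highest success rate."""
    files = sorted((Path(repo_dir) / "results").glob("*.json"))
    records = [json.loads(p.read_text()) for p in files]
    return max(records, key=lambda r: r["successes"] / max(r["trials"], 1))


with tempfile.TemporaryDirectory() as repo:
    write_result(repo, "station-1", "larger action chunks", "behavior_cloning", 41, 50)
    path = write_result(repo, "station-2", "denser reward", "reinforcement_learning", 47, 50)
    winner = best_result(repo)
    commands = git_share_commands(path, "station-2: 47/50 successes")

print(winner["station"], winner["method"])
```

Giving each station its own file keeps concurrent commits from touching the same lines, and comparing recorded success rates is one plausible reading of how an agent could choose between methods from real-world signals.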
Why this matters
We’ve seen robots learn from human‑written datasets for years, but Nvidia’s latest work pushes the loop further. An AI coding agent now reads academic papers, drafts hypotheses, and rewrites the training code without human touch. The eight‑robot fleet achieved up to 99 percent success on dexterous grasping tasks, a striking figure that suggests the approach can handle “tricky” manipulations.
Yet the experiments remain confined to a controlled lab, and it is unclear whether the same autonomy will survive in more variable environments. The system leans on behavior cloning and reinforcement learning, methods already familiar to many of us, but the code‑editing step introduces a new layer of complexity that could hide subtle bugs. For developers, the prospect of an agent that iterates on its own code may shorten the research‑to‑deployment cycle, but we should watch for hidden maintenance costs.
Founders might view the high success rate as a proof point, yet scaling the pipeline beyond eight machines will require further validation. Researchers can now explore whether self‑modifying agents can generalise beyond grasping, but the path forward is still uncertain.