

Karpathy's Autoresearch: AI Testing Automated Overnight



Andrej Karpathy just dropped Autoresearch, an open‑source framework that spins up hundreds of machine‑learning trials every night. The codebase, posted on GitHub, hooks into a cluster of GPUs and orchestrates data loading, model tweaks and evaluation without a human in the loop. In practice, a single developer can fire it up and watch a cascade of variations churn out results while they sleep.
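Karpathy's actual code is more involved, but the core pattern described above (mutate a configuration, run it, keep what improves, repeat all night) can be sketched in a few lines. This is an illustrative toy, not Autoresearch's API: the names `nightly_sweep`, `mutate`, and `run_experiment` are hypothetical, and `run_experiment` is a stand-in for real training and evaluation.

```python
import random

BASELINE = {"lr": 3e-4, "batch_size": 64, "init": "xavier"}

def mutate(config):
    """Return a randomly perturbed copy of a config."""
    new = dict(config)
    new["lr"] = config["lr"] * random.choice([0.5, 1.0, 2.0])
    new["init"] = random.choice(["xavier", "kaiming"])
    return new

def run_experiment(config):
    """Stand-in for train + eval; a real version would fit a model
    under this config and return its validation loss."""
    return random.uniform(0.5, 2.0)

def nightly_sweep(n_trials=100, seed=0):
    """Greedy evolutionary loop: keep any mutation that lowers loss."""
    random.seed(seed)
    best_config, best_loss = BASELINE, run_experiment(BASELINE)
    for _ in range(n_trials):
        candidate = mutate(best_config)
        loss = run_experiment(candidate)
        if loss < best_loss:  # keep improvements, discard the rest
            best_config, best_loss = candidate, loss
    return best_config, best_loss

config, loss = nightly_sweep()
```

Scaling `n_trials` into the hundreds and parallelizing `run_experiment` across a GPU cluster is what turns this loop from a toy into an overnight research engine.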

The tool’s claim to fame isn’t just the raw volume of runs; it’s the way it rewrites the usual trial‑and‑error cycle into a continuous, automated pipeline. Automated experimentation isn’t a new concept, but Karpathy’s implementation pushes the tempo to “silicon speed,” in his own words. The community on X quickly amplified the demo, asking whether such a workflow could become a standard part of research practice.



By automating the "scientific method" for code, Karpathy has turned machine learning into an evolutionary process that runs at the speed of silicon rather than the speed of human thought. More than that, it showed the broader AI and machine learning community on X that this type of process could be applied far beyond computer science, to fields like marketing, health, and, well, basically anything that requires research.

Autoresearch spreads far and wide

The reaction was swift and viral, with Karpathy's post garnering more than 8.6 million views in the two days since, as builders and researchers scrambled to scale the "Karpathy loop".

Varun Mathur, CEO of AI tool aggregator platform Hyperspace AI, took the single-agent loop and distributed it across a peer-to-peer network. Every node running the Hyperspace agent became an autonomous researcher. On the night of March 8-9, 35 autonomous agents on the Hyperspace network ran 333 experiments completely unsupervised.

The results were a masterclass in emergent strategy:

Hardware Diversity as a Feature: Mathur noted that while H100 GPUs used "brute force" to find aggressive learning rates, CPU-only agents on laptops were forced to be clever. These "underdog" agents focused on initialization strategies (like Kaiming and Xavier init) and normalization choices because they couldn't rely on raw throughput.

Gossip-Based Discovery: Using the GossipSub protocol, agents shared their wins in real time. When one agent found that Kaiming initialization dropped loss by 21%, the idea spread through the network like a digital virus. Within hours, 23 other agents had incorporated the discovery into their own hypotheses.

The Compression of History: In just 17 hours, these agents independently rediscovered ML milestones, such as RMSNorm and tied embeddings, that took human researchers at labs like Google Brain and OpenAI nearly eight years to formalize.
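The two initialization strategies the "underdog" agents reportedly explored are standard and cheap to compute, which is exactly why CPU-bound agents could afford to experiment with them. A minimal NumPy sketch of the textbook formulas (not Hyperspace's code):

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng):
    """Xavier/Glorot: variance scaled by the average of fan-in and
    fan-out, historically suited to tanh/sigmoid activations."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def kaiming_init(fan_in, fan_out, rng):
    """Kaiming/He: variance scaled by fan-in only, compensating for
    the variance halving that ReLU activations introduce."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
w_xavier = xavier_init(512, 512, rng)
w_kaiming = kaiming_init(512, 512, rng)
# For a square layer, Kaiming weights are roughly sqrt(2) times wider
# than Xavier weights, which matters for ReLU networks.
```

Swapping one function call for the other changes nothing about throughput, which is why initialization choice is a natural search axis for compute-poor agents.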
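The spread of a discovery through the network can be illustrated with a toy, in-memory gossip simulation. The real deployment used libp2p's GossipSub over a peer-to-peer network; this sketch only captures the qualitative behavior (each agent forwards known wins to a few random peers per round), and every name in it is hypothetical.

```python
import random

class Agent:
    def __init__(self, name):
        self.name = name
        self.known_wins = set()  # discoveries this agent has heard about

    def gossip(self, peers, fanout=3):
        """Forward every known win to `fanout` randomly chosen peers."""
        for peer in random.sample(peers, k=min(fanout, len(peers))):
            peer.known_wins |= self.known_wins

random.seed(0)
agents = [Agent(f"agent-{i}") for i in range(35)]
WIN = "kaiming-init: -21% loss"
agents[0].known_wins.add(WIN)  # one agent makes a discovery

for _ in range(5):  # a few gossip rounds spread the win network-wide
    for agent in agents:
        others = [a for a in agents if a is not agent]
        agent.gossip(others)

informed = sum(WIN in a.known_wins for a in agents)
```

Because each informed agent forwards to several peers per round, coverage grows roughly geometrically, which is why a single result can saturate a 35-node network within hours.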

Will this modest 630‑line script reshape how researchers iterate? Autoresearch is now publicly available on GitHub under an MIT license, and it already claims to launch hundreds of AI experiments each night while its creator sleeps. The code is described as a simple automation of the scientific method, turning machine‑learning development into an evolutionary process that runs at the speed of silicon rather than human thought.

Yet the project is not a polished model or a corporate product; it is a proof‑of‑concept script that invites the broader community to experiment. Its ambition is clear, but whether the approach will scale beyond nightly test batches or become a staple in research pipelines remains uncertain. The community on X has taken note, acknowledging that such automation could be possible, but concrete evidence of long‑term impact is still lacking.

In short, Karpathy’s release offers a tangible step toward automated experimentation, though its practical significance will need to be demonstrated over time.


Common Questions Answered

How does Autoresearch automate machine learning experiments?

Autoresearch is a 630-line open-source framework that automatically runs hundreds of machine learning trials nightly using a GPU cluster. The tool orchestrates data loading, model variations, and evaluations without human intervention, effectively turning machine learning development into an automated, evolutionary process.

What makes Autoresearch unique in the AI research workflow?

Karpathy's Autoresearch transforms the traditional scientific method by running experiments at the speed of silicon instead of human thought. The framework allows a single developer to launch and monitor multiple AI trials simultaneously, potentially accelerating research and innovation across various fields like computer science, marketing, and health.

Where can developers access the Autoresearch framework?

Autoresearch is publicly available on GitHub under an MIT license, making it freely accessible to researchers and developers worldwide. The open-source nature of the project allows anyone to examine, use, and potentially contribute to the framework's development.