Andrej Karpathy Releases nanochat: Open-Source ChatGPT Clone

When Andrej Karpathy dropped nanochat, what struck me was how quickly it turns a vague idea into a working codebase. It builds on his earlier nanoGPT, which only covered pretraining, and adds the post-training steps that every chatbot needs. The repo actually walks you through tokenizer training, pretraining, midtraining, supervised fine-tuning and optional reinforcement learning - basically the whole recipe for a decent conversational model.

The thing that sticks out is how stripped-down it is. Karpathy has boiled the whole pipeline down to roughly 8,000 readable lines that a single script drives end to end on one GPU node. The model is a small Transformer trained from scratch, showing that you don’t need a massive model to get chat-like behavior if you train and align it sensibly. That probably lowers the entry barrier for anyone who wanted to tinker with post-training but thought it was reserved for big labs.

It also feels like a natural continuation of his open-source habit. His earlier nanoGPT became a go-to reference; nanochat looks set to do the same for alignment experiments, tutorials and spin-offs. By sharing the full stack, Karpathy lets more people peek under the hood and maybe push the field forward - something that benefits the whole community.

OpenAI co-founder and Eureka Labs founder Andrej Karpathy has released nanochat, an open-source project that provides a full-stack training and inference pipeline for a simple ChatGPT-style model. The repository follows his earlier project, nanoGPT, which focused only on pretraining. The code is available on GitHub at https://github.com/karpathy/nanochat.

In a post on X, Karpathy said, “You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.” The repo consists of about 8,000 lines of code and covers the entire pipeline. It includes tokeniser training in Rust and pretraining a Transformer LLM on FineWeb. The pipeline also handles mid-training on user-assistant conversations and multiple-choice questions, supervised fine-tuning (SFT), and optional reinforcement learning (RL) with GRPO.
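
The GRPO stage is the least familiar piece for most readers. The core idea is that GRPO drops PPO's learned value-function critic: several completions are sampled per prompt, and each one's advantage is its reward measured against its own group's statistics. Below is a minimal sketch of just that scoring step in plain Python; the function name and the normalisation-by-standard-deviation detail are illustrative assumptions, not code lifted from nanochat (which uses its own simplified variant).

```python
import statistics

def grpo_advantages(rewards: list[float]) -> list[float]:
    """Group-relative advantages: score each sampled completion
    against the mean (and spread) of its own group, so no separate
    value network or learned critic is needed."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against all-equal rewards
    return [(r - mean) / std for r in rewards]

# Example: 4 completions sampled for one prompt, scored by a
# verifiable reward (e.g. 1.0 if the final answer is correct, else 0.0).
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # [1.0, -1.0, -1.0, 1.0]
```

Correct completions get a positive advantage and incorrect ones a negative advantage, which is what the policy-gradient update then pushes the model toward or away from.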

Finally, it supports efficient inference with KV caching. Users can interact with the model through a command-line interface or a web UI, and the system generates a markdown report summarising performance. Karpathy explained that the models can be trained at different scales depending on time and cost.
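
The KV cache is what makes autoregressive decoding cheap: keys and values for already-processed tokens are stored, so each new token only pays for one attention step instead of re-encoding the whole prefix. Here is a minimal single-head PyTorch sketch of the idea - an illustration under simplified assumptions, not nanochat's actual inference engine:

```python
import torch

def attend_with_cache(x_t, w_q, w_k, w_v, cache):
    """One decoding step of single-head attention with a KV cache.

    x_t is the embedding of the newest token only, shape (1, d).
    Rather than re-projecting the whole prefix every step, we append
    this step's key/value to the cache and attend over all of it."""
    q = x_t @ w_q                                     # (1, d)
    cache["k"] = torch.cat([cache["k"], x_t @ w_k])   # grows to (t, d)
    cache["v"] = torch.cat([cache["v"], x_t @ w_v])
    scores = (q @ cache["k"].T) / cache["k"].shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ cache["v"] # (1, d)

d = 16
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
cache = {"k": torch.empty(0, d), "v": torch.empty(0, d)}
for x_t in torch.randn(5, 1, d):  # five decoding steps
    out = attend_with_cache(x_t, w_q, w_k, w_v, cache)
print(cache["k"].shape)  # torch.Size([5, 16]): one cached key per step
```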

A small ChatGPT clone can be trained for around $100 in roughly 4 hours on an 8×H100 GPU node, which is enough for basic conversation. Training for about 12 hours enables the model to surpass GPT-2 on the CORE benchmark.
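
The quoted budget is easy to sanity-check. Assuming an 8×H100 node rents for about $24/hour (roughly $3 per GPU-hour, a typical ballpark rate at the time of release - an assumption, not a figure from this article), the arithmetic works out:

```python
# Sanity check on the quoted training budgets. The hourly rate is an
# assumption (~$24/hour for an 8xH100 node, i.e. ~$3 per GPU-hour).
NODE_RATE_USD_PER_HOUR = 24

print(NODE_RATE_USD_PER_HOUR * 4)   # 96  -> the ~$100, ~4-hour speedrun tier
print(NODE_RATE_USD_PER_HOUR * 12)  # 288 -> the ~12-hour run that clears GPT-2 CORE
```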

Related Topics: #nanochat #Andrej Karpathy #ChatGPT #OpenAI #nanoGPT #GRPO #FineWeb #open-source #reinforcement learning

If you’ve been eyeing conversational AI but felt the entry barrier was too steep, nanochat probably feels like a breath of fresh air. Karpathy keeps doing his thing, turning what could be a research demo into something you can actually play with. Where nanoGPT handed us the pretraining block, nanochat adds the post-training stages and a simple web UI that lets you run a real chat, even if you’re still figuring out the details.

The timing feels right, too. More teams are now tinkering with niche assistants or custom training pipelines, and a tidy reference implementation can shave days, or weeks, off the engineering grind. The community that grew around Karpathy’s earlier releases has already started tossing in tweaks, better docs, and little extensions that make the whole stack feel a bit more polished.

This isn’t pitched as the next ChatGPT challenger; it’s more about giving developers a clear path from raw model to working interface. If you’re curious about the steps between pre-training and a usable chat window, nanochat acts like a hands-on guide. The real payoff will show up as people remix those pieces for their own projects, keeping the open-source spirit alive with code that’s easy to read and adapt.

Common Questions Answered

How does nanochat expand upon Andrej Karpathy's previous nanoGPT project?

nanochat extends nanoGPT, which focused solely on pretraining, into a complete full-stack pipeline: tokenizer training, pretraining, midtraining on conversations, supervised fine-tuning, and optional reinforcement learning with GRPO, plus an inference engine and web UI. In other words, it delivers the entire recipe needed to go from raw text to a functional chatbot.

What specific training stages are included in the nanochat pipeline for creating a ChatGPT-style model?

The nanochat pipeline covers tokenizer training in Rust, pretraining a Transformer on FineWeb, midtraining on user-assistant conversations and multiple-choice data, supervised fine-tuning (SFT), and an optional reinforcement learning stage using GRPO. Together, these stages transform a pretrained language model into a conversational AI that can interact in a ChatGPT-like manner, served through efficient KV-cached inference.

What is the practical benefit for developers who want to create their own conversational AI using nanochat?

nanochat significantly lowers the barrier to entry by providing an accessible, open-source implementation that developers can run on a cloud GPU box with a single script. According to Karpathy, this process can yield a functional, personal LLM for chat applications in as little as four hours, making hands-on experience with conversational AI much more attainable.