Developer Replaces LLM Wiki With Pure Python Compiler, Citing Over-Engineering
What if you could build a structured, cross-referenced personal wiki without ever calling an LLM or touching an API? That’s exactly what one developer set out to prove by replacing a complex agent-driven system with a lean, deterministic compiler written in pure Python. The goal was simple: take a folder of messy, inconsistent text notes and turn them into a polished, interlinked knowledge base, using nothing but the standard library.
This approach strips away the non-determinism and recurring costs of model-based systems, focusing instead on parsing, graph-building, and linting. The resulting pipeline is fast, reproducible, and entirely self-contained. It handles real-world messiness without breaking, scales predictably, and preserves hand-written content across recompiles. And it does all this without a single network call.
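To make the parse, graph-build, and lint stages concrete, here is a minimal standard-library sketch. It is not the developer's code: the `[[Title]]` wikilink syntax, the slug rules, and the function names `slugify`, `parse_links`, `build_graph`, and `lint` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumption: notes reference each other with [[Title]] or [[Title|label]] links.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def slugify(title: str) -> str:
    """Normalize a title so 'Parsing' and 'parsing ' map to the same page."""
    return re.sub(r"[^a-z0-9]+", "-", title.strip().lower()).strip("-")


def parse_links(text: str) -> list[str]:
    """Return the sorted, de-duplicated slugs that a note links to."""
    return sorted({slugify(m.group(1)) for m in LINK_RE.finditer(text)})


def build_graph(notes: dict[str, str]) -> tuple[dict[str, list[str]], dict[str, list[str]]]:
    """Build forward links and backlinks from a {title: body} mapping."""
    forward = {slugify(title): parse_links(body) for title, body in sorted(notes.items())}
    back = defaultdict(list)
    for source, targets in forward.items():
        for target in targets:
            back[target].append(source)
    return forward, {page: sorted(sources) for page, sources in sorted(back.items())}


def lint(forward: dict[str, list[str]], backlinks: dict[str, list[str]]) -> list[str]:
    """Report links to missing pages and pages that nothing links to."""
    problems = []
    for source, targets in forward.items():
        for target in targets:
            if target not in forward:
                problems.append(f"broken link: {source} -> {target}")
    for page in forward:
        if page not in backlinks:
            problems.append(f"orphan page: {page}")
    return sorted(problems)


if __name__ == "__main__":
    notes = {
        "Compilers": "See [[Parsing]] and [[Missing Page]].",
        "Parsing": "Used by [[compilers]].",
    }
    forward, backlinks = build_graph(notes)
    for problem in lint(forward, backlinks):
        print(problem)
```

Every collection is sorted before it is returned, so the same input folder always yields the same graph and the same lint report, which is the reproducibility property the article emphasizes.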
The problem with agent-driven wikis
The idea of using an LLM to build and maintain a personal wiki isn't new, and it isn't mine. It gained serious traction after Andrej Karpathy described the pattern in a widely shared post, where he explained that he was spending less of his token budget generating code and more of it building structured, persistent knowledge bases out of his research notes. He followed up with a public "idea file" laying out the architecture in more depth, and explicitly compared the process to compilation: raw sources go in, a structured, cross-referenced wiki comes out, and the LLM is the thing doing the compiling [1][2].
I think that compilation framing is exactly right. I just don't think an LLM needs to be the compiler. If your raw source is already local, already text, and already deterministic, routing it through a probabilistic system to organize it introduces three costs that a parser or a compiler simply doesn't have:
Cost: Every time you add a new document, an agent-driven wiki re-reads content, decides what changed, and rewrites pages.
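For contrast, a deterministic compiler can decide "what changed" without reading anything for meaning. The sketch below hashes each note and compares it against a stored manifest. The article does not say the developer's compiler works this way; the manifest idea and the names `content_hash`, `changed_notes`, and `updated_manifest` are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a note's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_notes(notes: dict[str, str], manifest: dict[str, str]) -> list[str]:
    """Names of notes that are new or whose text differs from the manifest."""
    return sorted(
        name for name, text in notes.items()
        if manifest.get(name) != content_hash(text)
    )


def updated_manifest(notes: dict[str, str]) -> dict[str, str]:
    """Manifest to store after a successful compile (for example as JSON)."""
    return {name: content_hash(text) for name, text in sorted(notes.items())}


if __name__ == "__main__":
    notes = {"a.txt": "alpha", "b.txt": "beta"}
    manifest = updated_manifest(notes)

    notes["b.txt"] = "beta v2"   # edited note
    notes["c.txt"] = "gamma"     # new note
    print(changed_notes(notes, manifest))
```

Only the notes reported by `changed_notes` would need recompiling, and the check costs a hash per file rather than a model call per document.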
Why this matters
We're witnessing a quiet but significant pushback against the assumption that every knowledge problem requires an LLM. This developer's journey, from agent loops back to deterministic compilation, reveals something crucial about our field's current moment: not every task needs probabilistic reasoning when deterministic parsing will do. For developers and founders building tools in this space, it's a reminder that sometimes the most elegant solution isn't the most complex one. It's the one that just works, every time, without API calls or hidden randomness. This isn't an argument against LLMs altogether, but rather a call to use them where they truly add value, not just because they're the shiny new tool in the box.
Common Questions Answered
Why did the developer replace their LLM-based wiki with a pure Python compiler?
The developer replaced the LLM-based system to eliminate non-determinism and reduce recurring API costs associated with agent-driven wikis. By using a lean, deterministic compiler written in pure Python with only the standard library, they could transform messy text notes into a polished, interlinked knowledge base without the complexity and expense of calling an LLM or API.
What did Andrej Karpathy say about LLM-built wikis?
Karpathy did not present the pattern as a problem. In a widely shared post, he explained that he was spending less of his token budget generating code and more of it building structured, persistent knowledge bases out of his research notes. In a follow-up "idea file" he compared the process to compilation: raw sources go in, a structured, cross-referenced wiki comes out, and the LLM does the compiling. The cost critique comes from the developer, who accepts the compilation framing but argues that an LLM does not need to be the compiler when the sources are already local text.
How does the pure Python compiler approach differ from using an LLM for wiki creation?
The pure Python compiler uses deterministic parsing to process inconsistent text notes into a structured knowledge base, whereas LLM-based approaches rely on probabilistic reasoning and non-deterministic outputs. This deterministic method eliminates the unpredictability and API dependencies of agent loops while proving that not every knowledge problem requires an LLM to solve effectively.
What is the key insight about knowledge management tools that this developer's approach reveals?
The developer's journey demonstrates that deterministic parsing solutions can be more elegant and effective than complex LLM-based systems for certain tasks like wiki creation. This challenges the current industry assumption that every knowledge problem requires probabilistic reasoning, showing that sometimes simpler, deterministic approaches are the most appropriate solution.