Engineer Verbeek Saves USD 20 Million Annually After Mother Flags Prompt Redundancy
When a senior engineer at a mid‑size AI firm traced a puzzling rise in compute costs, the culprit turned out to be something as mundane as duplicated wording in a prompt. The engineer, Jeroen Verbeek, had been tweaking the model’s instructions for months, yet the savings never materialised. It was only after his mother, who’d been reviewing his work for clarity, pointed out that identical directives appeared in several sections that the issue surfaced.
The team discovered that their primary prompt was being assembled on the fly from a patchwork of source files, each tweaked in isolation. As each fragment was refined, the overlapping language compounded, inflating token usage and, consequently, the bill. By consolidating the repetitive parts, the company slashed its annual spend by roughly $20 million.
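The article does not show how the company's prompt builder works internally; the following minimal Python sketch (all fragment text and function names are illustrative assumptions) shows the failure mode described above: fragments tuned in isolation are concatenated at runtime, and identical directives end up repeated in the final prompt.

```python
from collections import Counter

def assemble_prompt(fragments):
    """Concatenate prompt fragments the way a dynamic builder might."""
    return "\n".join(fragments)

def find_duplicate_lines(prompt):
    """Return instruction lines that appear more than once in the prompt."""
    lines = [ln.strip() for ln in prompt.splitlines() if ln.strip()]
    counts = Counter(lines)
    return {ln: n for ln, n in counts.items() if n > 1}

# Hypothetical fragments, each edited in isolation by a different engineer.
fragments = [
    "Always answer in English.\nKeep responses concise.",
    "Keep responses concise.\nCite sources when possible.",
    "Always answer in English.\nAvoid speculation.",
]

prompt = assemble_prompt(fragments)
dupes = find_duplicate_lines(prompt)
print(dupes)  # {'Always answer in English.': 2, 'Keep responses concise.': 2}
```

Every duplicated line here is billed on every request, which is how small overlaps in individual files compound into a large aggregate token bill.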
The revelation sparked a broader audit of how prompts are built and maintained, underscoring the hidden cost of unchecked redundancy in generative‑AI pipelines.
Looking at the trace, Verbeek said his mother asked why certain instructions were repeated multiple times across different parts of the prompt. "What we realised is that our system prompt is constructed dynamically from lots of different files," Verbeek added. "As we've been optimising each part, no one had looked at the coherence for a while.
"Together we found duplication, inconsistencies, and overly verbose formulations." He explained that over time, engineers had kept adding new instructions to emphasise specific behaviours, without removing or consolidating older ones. This led to unnecessary repetition and diluted the prompt's overall effectiveness. Verbeek said the team removed duplicate instructions, tightened the language, and preserved the original intent and balance of constraints.
After manually rewriting the first sections, he used an AI model to refactor the remaining portions in the same style, followed by a detailed line-by-line review to reintroduce a few critical safeguards. The revised prompt was then A/B tested over the New Year period. According to Verbeek, the updated system followed instructions more reliably, responded faster, and significantly reduced token usage, leading to substantial cost savings at scale.
Did a simple prompt tweak really shave $20 million off Lovable's bills? The answer, according to staff engineer Verbeek, appears to be yes. While reviewing the system prompt over the holidays, he traced the duplicated instructions his mother had flagged as redundant.
The prompt, he explained, is assembled from numerous files, and each optimization introduced unintended repetition. Removing that overlap reportedly made the platform run about four percent faster and cut annual large‑language‑model costs by nearly $20 million. The savings figure is striking, yet the article does not break down how the cost reduction was calculated or which usage metrics changed.
Moreover, it is unclear whether the same approach would yield comparable results in other AI‑driven services. Still, the episode underscores how seemingly minor prompt engineering can have measurable financial impact. Lovable’s experience suggests that routine audits of prompt construction may be worth considering, though broader applicability remains uncertain.
Future internal reviews may focus on similar redundancies across code bases.
Further Reading
- Papers with Code: Latest NLP Research
- Hugging Face Daily Papers
- arXiv cs.CL (Computation and Language)
Common Questions Answered
How did Engineer Jeroen Verbeek discover the prompt redundancy that cost the company USD 20 million annually?
Verbeek noticed a puzzling increase in compute costs and, after his mother pointed out identical directives repeated across the prompt, he traced the system prompt's dynamic assembly from multiple files. This review revealed duplicated and overly verbose instructions that were inflating resource usage.
What specific changes were made to the system prompt to achieve a four‑percent speed improvement?
The team removed duplicated instructions and streamlined inconsistent wording that had accumulated as engineers optimized separate files. By consolidating these overlapping directives, the prompt became more coherent, reducing unnecessary processing and boosting execution speed by roughly four percent.
Why did the duplicated wording in the prompt cause such a large increase in compute costs for the mid‑size AI firm?
Each redundant instruction meant the language model processed the same directive multiple times on every request, adding input tokens, and therefore compute and cost, to every call. At the platform's scale, those extra tokens compounded into an estimated annual cost of about USD 20 million.
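The article does not disclose the traffic or pricing behind the $20 million figure, but the arithmetic is straightforward. This back-of-envelope sketch uses entirely hypothetical numbers to show how per-request token waste scales to an annual bill of that order.

```python
# All figures below are assumptions for illustration, not Lovable's numbers.
tokens_removed_per_request = 1_500     # duplicated prompt tokens eliminated
requests_per_day = 10_000_000          # assumed daily request volume
price_per_million_input_tokens = 3.00  # assumed USD rate for input tokens

daily_savings = (tokens_removed_per_request * requests_per_day
                 / 1_000_000) * price_per_million_input_tokens
annual_savings = daily_savings * 365
print(f"${annual_savings:,.0f} per year")  # $16,425,000 per year
```

With these assumed inputs the savings land in the tens of millions, which is the same order of magnitude the article reports.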
What role did the dynamic construction of the system prompt from many files play in the redundancy issue?
Because the prompt was assembled from numerous files, engineers added optimizations to individual sections without checking the overall coherence. This fragmented approach allowed duplicated and inconsistent instructions to slip through, creating the redundancy that later cost the company millions.