Research & Benchmarks

RECAP tool shows Claude 3.7 reproduces ~3,000 passages from the first Harry Potter book and large excerpts from The Hobbit


Why does this matter? Because a new benchmark called RECAP is pulling back the curtain on how much copyrighted prose today’s language models can reproduce. The tool, designed to probe large‑scale models for verbatim recall, runs a series of prompts that ask the model to “recap” a text without revealing the source.
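
For readers curious what such a probe might look like in practice, here is a minimal sketch, not the authors' actual code; the `query_model` helper, the prompt wording, and the function names are hypothetical stand-ins.

```python
# A minimal, hypothetical sketch of a recall probe in the spirit of RECAP.
# `query_model` is a placeholder for whatever LLM API client is actually used.

def query_model(prompt: str) -> str:
    """Stand-in for a call to a hosted language model."""
    raise NotImplementedError("wire this up to an LLM API of your choice")

def probe_recall(book_title: str, scene_hint: str) -> str:
    """Ask the model to 'recap' a scene in detail, nudging it toward
    reproducing the original wording rather than a loose summary."""
    prompt = (
        f"Recap the scene in {book_title} where {scene_hint}. "
        "Stay as close to the original wording as you can remember."
    )
    return query_model(prompt)
```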

While the idea sounds straightforward, the results have been anything but. Researchers ran the test on several high‑profile models, including Anthropic’s Claude 3.7, and compared the output against baselines produced by earlier extraction methods. The disparity was stark.

By scanning the generated passages for exact matches, the team could quantify how many snippets from well‑known books resurfaced in the model’s answers. This method gives a concrete measure of what was previously an anecdotal concern: that LLMs might be stitching together large blocks of protected material rather than merely echoing general knowledge. The numbers that emerged are enough to raise eyebrows and prompt a deeper look at the limits of “creative” generation.
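
As an illustration only, a crude version of that exact-match scan could count how many word n-grams from a model's output appear verbatim in the reference book; the ten-word window below is an arbitrary assumption, not the paper's metric.

```python
def count_verbatim_ngrams(generated: str, reference: str, n: int = 10) -> int:
    """Count word n-grams of the generated text that appear verbatim in the
    reference book -- a crude stand-in for the exact-match scan described above."""
    ref_text = " ".join(reference.split())   # normalize whitespace in the book text
    words = generated.split()
    hits = 0
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        if window in ref_text:
            hits += 1
    return hits

# Example: a 12-word generated snippet copied exactly from the reference
# contributes 3 overlapping 10-word matches.
```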

In testing, RECAP coaxed models into reconstructing large portions of books like *The Hobbit* and *Harry Potter* with striking accuracy. For example, the researchers found that Claude 3.7 generated around 3,000 passages from the first *Harry Potter* book under RECAP, compared to just 75 passages surfaced by earlier methods.

To test RECAP's limits, the team introduced a new benchmark called EchoTrace, which includes 35 complete books: 15 public-domain classics, 15 copyrighted bestsellers, and five recently published titles that were definitely not part of the models' training data.

They also added 20 research articles from arXiv. The results showed that models could reproduce passages from almost every category, sometimes nearly word for word, except for the books the models hadn't seen during training.
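
One way to read a split like that, sketched here with made-up numbers rather than the study's data, is to average an overlap score per category; titles released after the training cutoff act as a control that should sit near zero if high overlap really reflects memorized training data.

```python
from collections import defaultdict

def mean_overlap_by_category(results):
    """Average an overlap score per benchmark category.
    `results` is a list of (category, score) pairs; all values here are invented."""
    buckets = defaultdict(list)
    for category, score in results:
        buckets[category].append(score)
    return {cat: sum(scores) / len(scores) for cat, scores in buckets.items()}

# Hypothetical numbers only, for illustration of the category split.
example = [
    ("public_domain", 0.42),
    ("copyrighted_bestseller", 0.37),
    ("post_cutoff_release", 0.02),
    ("arxiv_article", 0.18),
]
print(mean_overlap_by_category(example))
```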

The RECAP study puts a spotlight on how much text large language models can retain. By looping feedback through several models, researchers at Carnegie Mellon and Instituto Superior Técnico managed to coax Claude 3.7 into reproducing roughly 3,000 passages from the first *Harry Potter* book, while earlier checks surfaced only dozens. Likewise, sizable excerpts from *The Hobbit* emerged with “striking accuracy,” according to the authors.

Implications for Copyright Law

What does this mean for copyright enforcement? The findings suggest that memorization is not merely anecdotal; it can be quantified. Yet the report stops short of claiming that every model behaves identically, or that the reproduced material would survive legal scrutiny. The tool itself is new, and its limits—such as how it scales across diverse corpora or how it handles less‑famous works—remain unclear.

Consequently, while the evidence raises legitimate concerns for future lawsuits, the precise legal ramifications are still uncertain. Further research will be needed to determine whether RECAP’s methodology can become a standard benchmark for assessing model memorization.

Common Questions Answered

How does the RECAP benchmark evaluate Claude 3.7's ability to reproduce copyrighted prose?

RECAP prompts Claude 3.7 to "recap" a text without revealing its source, then checks the output for verbatim passages. The study found the model reproduced roughly 3,000 passages from the first Harry Potter book, far exceeding earlier methods.

What were the key findings of the RECAP study regarding excerpts from *The Hobbit*?

The researchers observed that Claude 3.7 generated sizable excerpts from *The Hobbit* with striking accuracy when using RECAP. These excerpts were comparable in length to the Harry Potter passages, highlighting the model's extensive memorization of copyrighted material.

What is the EchoTrace benchmark and how does it relate to the RECAP experiments?

EchoTrace is a supplemental benchmark introduced by the team to test RECAP's limits. It includes 35 complete books: 15 public-domain classics, 15 copyrighted bestsellers, and five titles published too recently to appear in the models' training data, plus 20 arXiv research articles. It provides a broader context for measuring how many passages models can recall across diverse texts.

Which institutions conducted the RECAP research and what implications does it have for copyright enforcement?

The study was carried out by researchers at Carnegie Mellon University and Instituto Superior Técnico. Their findings suggest that large language models can retain and reproduce large swaths of copyrighted text, raising significant challenges for current copyright law and enforcement strategies.