
AI trained on two books mimics famous authors, beats human imitators


Feeding an AI just two books can be enough to make it sound like a famous writer. Researchers from Stony Brook University and Columbia Law School trained separate models on a pair of works by each author, then asked volunteers to pick which passage felt more authentic. Most people, surprisingly, chose the machine-generated text over pieces written by professional imitators.

The vote tally tipped in favor of the AI excerpts, and that result raises a tricky copyright question, especially as lawsuits over AI-generated content wind through U.S. courts.

If a program can copy a recognizable style from such a tiny dataset, the line between inspiration and infringement starts to look fuzzy. The study isn’t saying robots will replace writers; it simply shows that a couple of books can convince casual readers that a mimicked voice is genuine. I’m left wondering how long that kind of imitation stays on the right side of the law.

A new study shows that AI models fine-tuned on just two books can generate writing in the style of famous authors that readers prefer over work by professional imitators. The results could impact copyright law and ongoing lawsuits in the US. Researchers at Stony Brook University and Columbia Law School had professional writers and three major AI systems create passages in the style of 50 well-known authors, including Nobel Prize winner Han Kang and Booker Prize winner Salman Rushdie. A total of 159 participants, including 28 writing experts and 131 non-experts from the crowdsourcing platform Prolific, judged the passages without knowing whether a human or an AI had written them.
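The article doesn’t describe the researchers’ pipeline in detail, but as a rough illustration of what “fine-tuning on two books” can mean in practice, here is a minimal sketch using the open-source Hugging Face transformers and datasets libraries. The model choice (gpt2), file names, and hyperparameters are all assumptions for demonstration, not the study’s actual setup.

```python
# Minimal, illustrative sketch (NOT the study's pipeline): fine-tune a small
# open causal language model on the plain text of two books by one author.
# Model, file paths, and hyperparameters are assumptions for demonstration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # small stand-in; the study used larger commercial systems
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Two plain-text files, one per book by the target author (hypothetical names).
books = load_dataset("text", data_files={"train": ["book_one.txt", "book_two.txt"]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_set = books["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="author-style-model",
        num_train_epochs=3,
        per_device_train_batch_size=2,
    ),
    train_dataset=train_set,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Sample a short passage from the fine-tuned model.
prompt = tokenizer("The rain had been falling since", return_tensors="pt").to(model.device)
output = model.generate(**prompt, max_new_tokens=80, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```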


Two books teaching a machine an author’s voice sounds wild, but the study says it works. By fine-tuning AI on just a pair of titles, researchers produced passages that many readers preferred over work from professional imitators. The team, from Stony Brook University and Columbia Law School, pitted three big AI systems against human writers, targeting the styles of 50 well-known authors, from Nobel laureate Han Kang to Booker winner Salman Rushdie.

In blind surveys, participants kept picking the AI text, which hints that the models can latch onto key stylistic cues with surprisingly little material. Still, the training set was tiny, so I wonder how well this stretches to other genres or to writers with fewer published works. The legal side is fuzzy, too; it’s not clear whether such outputs infringe copyright or how they might shift existing lawsuits.

So the results give us a solid data point in the AI-authorship debate, but they certainly don’t settle the bigger policy and ethical questions around machine-crafted literature.

Common Questions Answered

How many books were used to fine‑tune the AI models that mimicked famous authors?

The study fine‑tuned each AI model on just two books by a target author. Despite the minimal data, the models could generate prose that resembled the authors' distinctive styles.

Which institutions conducted the research on AI‑generated author imitations?

Researchers from Stony Brook University and Columbia Law School carried out the experiments. They collaborated to evaluate both the technical performance of the AI and its legal implications.

What was the comparative preference of readers between AI‑generated passages and those written by professional imitators?

In blind surveys, participants consistently ranked the AI‑generated passages higher than the human‑written imitations. This preference held across the 50 celebrated authors tested, including Han Kang and Salman Rushdie.

Why might the study’s findings affect ongoing copyright lawsuits in the United States?

Because the AI could replicate an author’s voice from only two books, the results raise questions about the scope of copyright protection for literary style. Law scholars suggest this could influence how courts treat AI‑generated works in future litigation.