Data2Story converts CSVs to articles with 7 AI agents; 74 percent of 53 readers prefer them to human-written originals

Data2Story is a new AI pipeline that turns a raw CSV file into a verified, interactive news story. Built by researchers at Oxford and Stanford, the system runs as a Claude Code skill and coordinates seven specialized agents to generate research context, statistics, graphics and a clickable map. The output isn’t just a dump of numbers; every sentence, chart and interactive element is linked to its source—whether that’s a line of code, the original data file or an external URL—through an “Inspector” panel that presents the evidence as index cards.
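The linking of each claim to its evidence can be pictured as a small data structure. The following is a minimal sketch, not Data2Story's actual schema: the class names (`SourceKind`, `EvidenceCard`, `Claim`), the locator strings, and the file names in the comments are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class SourceKind(Enum):
    """The three kinds of source the article mentions."""
    CODE = "code"  # a line of analysis code
    DATA = "data"  # a location in the original data file
    URL = "url"    # an external web page


@dataclass(frozen=True)
class EvidenceCard:
    """One 'index card' of evidence (hypothetical shape)."""
    kind: SourceKind
    locator: str   # e.g. "analysis.py:42" or "matches.csv#row=17" (made-up examples)
    note: str = ""


@dataclass
class Claim:
    """A sentence, chart, or interactive element plus its evidence."""
    text: str
    evidence: list[EvidenceCard] = field(default_factory=list)

    def is_traceable(self) -> bool:
        # A claim counts as traceable if at least one card backs it.
        return len(self.evidence) > 0


def untraceable(claims: list[Claim]) -> list[Claim]:
    """Return the claims that have no evidence attached."""
    return [c for c in claims if not c.is_traceable()]
```

A check like `untraceable(...)` is one way a pipeline could refuse to publish a sentence that has no card behind it; whether Data2Story enforces anything similar is not stated in the article.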

The authors demonstrate the workflow on the 2026 FIFA World Cup schedule, a dataset that has seen little coverage so far. From the match list and host cities, the tool produces a climate‑focused article noting that roughly four in ten games fall in locations the players’ union FIFPRO rates as extremely high heat risk, with humidity, not temperature, driving the risk. The authors stress the figures reflect typical climate conditions, not a forecast for the tournament.

For images, video, and audio, the system pulls in OpenRouter models like gpt-5.4-image-2, seedance-2.0, and lyria-3-pro-preview.

53 readers rate agent articles higher than human originals

The researchers paired 18 public datasets with matching human-written originals from three distinct sources. They used the concise briefings from The Economist, the lavishly designed long reads from The Pudding, and the community datasets from TidyTuesday.

53 recruited readers rated both versions across five categories: visual design, narrative rhythm, data transparency, verifiability of claims, and insight gained. The biggest lead was in transparency, at +1.49 on a seven-point scale. Overall, 74 percent preferred the agent article, 25 percent the human version, and 2 percent called it a draw.
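The three preference shares add up to 101 percent, which is what independent rounding can produce. The sketch below assumes, purely for illustration, that the shares were computed over 53 individual votes split 39/13/1; the article reports only the percentages, so this split is hypothetical.

```python
# Hypothetical vote split over 53 readers (not reported in the article).
votes = {"agent": 39, "human": 13, "draw": 1}
total = sum(votes.values())  # 53

# Round each share to a whole percent separately, as a summary table would.
shares = {choice: round(100 * n / total) for choice, n in votes.items()}

print(shares)                # {'agent': 74, 'human': 25, 'draw': 2}
print(sum(shares.values()))  # 101
```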

The agent won clearly in data-heavy Economist briefings and TidyTuesday pieces. Against Pudding reports, which design teams often spend weeks crafting, it was a statistical tie. When measuring which statements from the human-written article also appear in the agent-generated article, Data2Story covers about half.

Conversely, only 35 percent of the agent's statements are found in the human text. The agent adds plenty of its own angles but only partly captures the editorial core. Coverage of the human findings is highest in short, formulaic Economist briefings, where the agent reproduces 73 percent of them, likely because those texts hew closely to standard statistics the agent calculates anyway.
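The two overlap figures are directional: one asks how many human statements reappear in the agent article, the other how many agent statements reappear in the human one. The sketch below shows that asymmetry with a toy matcher. How the researchers actually decided that two statements match is not described in the article, so `jaccard_match`, its threshold, and the function names are assumptions.

```python
import re
from typing import Callable


def tokens(statement: str) -> set[str]:
    """Lowercased alphanumeric tokens of a statement."""
    return set(re.findall(r"[a-z0-9]+", statement.lower()))


def jaccard_match(a: str, b: str, threshold: float = 0.5) -> bool:
    """Toy matcher: token-set Jaccard similarity at or above a threshold."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return False
    return len(ta & tb) / len(ta | tb) >= threshold


def coverage(
    source: list[str],
    target: list[str],
    matches: Callable[[str, str], bool] = jaccard_match,
) -> float:
    """Share of statements in `source` that have a match in `target`."""
    if not source:
        return 0.0
    hits = sum(1 for s in source if any(matches(s, t) for t in target))
    return hits / len(source)
```

Calling `coverage(human, agent)` gives the first figure and `coverage(agent, human)` the second. The two differ whenever the articles contain different numbers of statements or the agent adds angles the human piece lacks.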

Where humans still win

The researchers flag three areas where human authors stay ahead. On editorial perspective, reporters explain things the data can't.

Why this matters

We see a concrete attempt to shrink the weeks‑long grind of data journalism into a handful of automated steps. Data2Story, built by Oxford and Stanford researchers, runs as a Claude Code skill that coordinates seven AI agents and calls OpenRouter models for images, video, and audio, turning a raw CSV into an interactive article in which every claim is linked to its source. In a limited test with 53 readers, 74 percent preferred the AI‑generated pieces to their human‑written counterparts, suggesting that the approach can meet, or even exceed, audience expectations for clarity and engagement.

Yet the study only covered 18 public datasets matched to three sources of human articles, leaving it unclear whether the system scales to more complex investigations or niche domains. Moreover, the claim of “verified” output hinges on the pipeline’s internal checks, a detail the report does not unpack. For developers and founders, the prototype offers a glimpse of how multi‑agent orchestration might streamline newsroom workflows, but we should remain cautious until broader evaluations confirm reliability across varied data stories.
