ELI releases LLM benchmark showing top models resist Russian propaganda


Why does this matter now? As conversational AIs become default sources for quick answers, policymakers fear they could amplify state‑sponsored disinformation. In response, Estonia’s language research body has unveiled a new evaluation suite that pits dozens of large language models against a set of test prompts designed to expose susceptibility to Russian‑originated talking points.

Working with the volunteer defence group Propastop, the institute identified fourteen thematic buckets where Moscow’s influence campaigns are most active—ranging from the legal status of Crimea and the justification for the Ukraine conflict to narratives about NATO’s role and the historic annexation of the Baltic region during World War II. Each model receives a score reflecting how often it refrains from endorsing or elaborating on these contested frames. The results highlight a handful of systems that consistently avoid taking a stance, while many popular offerings still drift into the gray zone.

The benchmark offers a concrete, data‑driven way for developers and regulators to gauge where AI‑driven dialogue might unintentionally echo hostile propaganda.

To help combat this problem, the government-sponsored Estonian Language Institute (ELI) has released a new "Propaganda Resistance" benchmark ranking dozens of LLMs on their ability to avoid "tak[ing] positions on topics that the Russian Federation uses in its strategic narratives."

Because Estonia is a former Soviet republic that has been independent for just a few decades, many Estonians are particularly alert to what they see as false narratives promoted by their large and often belligerent neighbor to the east. Working with the volunteer-run Estonian defense collective Propastop, the ELI identified 14 broad categories in which it sees Russian influence operations trying to sway public discussion. These range from narratives on the current status of Crimea and justifications for the war in Ukraine to the history of NATO and justifications for Russia's annexation of the Baltic states during World War II.

For each category of propaganda, the researchers developed three kinds of questions: neutrally phrased ones, ones biased with "false assumptions" based on Russian propaganda, and ones that maliciously attempt to elicit explicit misinformation from the LLM. The questions were posed to the models in English, Estonian, and Russian. A separate AI model (calibrated to align with Propastop experts) judged the responses on the models' ability to "push back on propaganda narratives, without external help" from web search or other external tools.
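To make the test design concrete, here is a minimal Python sketch of how such a grid of categories, question framings, and languages could be enumerated and how judge verdicts could be rolled up into per-group resistance rates. This is an illustration only, not the ELI's implementation: the names `FRAMINGS`, `LANGUAGES`, `Verdict`, `build_grid`, and `resistance_rate` are hypothetical, the category labels are placeholders, and the sketch assumes one question per cell and a simple pass/fail verdict, neither of which the benchmark is confirmed to use.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import product

# Hypothetical labels for the three question framings and three languages
# described in the article; the benchmark's own identifiers are not known.
FRAMINGS = ("neutral", "biased", "malicious")
LANGUAGES = ("en", "et", "ru")


@dataclass(frozen=True)
class Verdict:
    category: str
    framing: str
    language: str
    resisted: bool  # assumed pass/fail call from the judge model


def build_grid(categories):
    """Enumerate every (category, framing, language) cell to be prompted."""
    return list(product(categories, FRAMINGS, LANGUAGES))


def resistance_rate(verdicts, key):
    """Share of resisted answers, grouped by the Verdict attribute `key`."""
    totals = defaultdict(int)
    resisted = defaultdict(int)
    for v in verdicts:
        group = getattr(v, key)
        totals[group] += 1
        resisted[group] += v.resisted  # True counts as 1, False as 0
    return {group: resisted[group] / totals[group] for group in totals}


# Placeholder category names; the article gives only the count (14).
categories = [f"category_{i}" for i in range(1, 15)]
grid = build_grid(categories)
```

With 14 categories, three framings, and three languages, `grid` holds 126 cells under the one-question-per-cell assumption. Calling `resistance_rate(verdicts, "language")` or `resistance_rate(verdicts, "framing")` on a list of judged answers would show whether a model holds up worse in one language or under one kind of prompt.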


Why this matters

We now have a concrete yardstick for measuring how well large language models keep Russian‑origin disinformation at bay. The Estonian Language Institute’s Propaganda Resistance benchmark ranks dozens of models, and the top performers appear to resist the targeted narratives. For developers, this offers a reference point when choosing a model for applications where political bias could be costly.

Founders can point to the rankings as part of risk‑assessment documentation, but they should remember that the test covers the narratives of only one adversary. Researchers gain a new dataset to probe why certain architectures or training regimes succeed where others falter. Yet the benchmark’s scope is limited; it does not address other forms of misinformation, nor does it guarantee resilience in real‑world deployments where prompts evolve.

Moreover, the ranking methodology is not fully disclosed, leaving open questions about reproducibility. As we integrate these findings, we must stay cautious, validate results in our own pipelines, and monitor whether the claimed resistance holds up under broader scrutiny.
