Research & Benchmarks

Meta's Omnilingual ASR hits sub‑10% error on 78% of 1,600 languages


Meta’s latest foray into speech AI aims at a problem most systems sidestep: recognizing spoken words in every language humans actually use. The company unveiled Omnilingual ASR, an automatic speech recognition model trained on audio from roughly 1,600 distinct languages, a figure that dwarfs the dozens typically supported by commercial products. While the sheer breadth is striking, the real question is whether the model can deliver usable accuracy across that diversity, especially for languages that have only a handful of recorded hours.

Researchers note that many of today’s recognizers falter when data are scarce, relegating entire linguistic communities to the margins of voice‑enabled technology. Meta’s engineers claim to have tackled that gap by scaling both data collection and model capacity, but the proof will be in the numbers.

---

According to Meta, Omnilingual ASR delivers a character error rate (CER) below 10 percent for 78 percent of the 1,600 languages tested. For languages with at least ten hours of training audio, 95 percent hit this mark. Even among "low-resource" languages with fewer than ten hours of audio, 36 percent fall below the 10 percent CER threshold.
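For readers unfamiliar with the metric, character error rate is the character-level edit (Levenshtein) distance between a hypothesis transcript and the reference, divided by the reference length. The following minimal Python sketch illustrates the computation; it is a generic implementation for illustration, not Meta's evaluation code:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein edit distance over reference length."""
    m, n = len(reference), len(hypothesis)
    # prev[j] holds the edit distance between reference[:i-1] and hypothesis[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / max(m, 1)

# One substituted character in an 11-character reference: CER of about 9%
print(cer("hello world", "hello worla"))  # 0.0909...
```

Under this definition, Meta's sub-10 percent threshold means fewer than one character in ten is inserted, deleted, or substituted relative to the reference transcript.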

To support further research and real-world use, Meta has also released the Omnilingual ASR Corpus, a large dataset of transcribed speech in 350 underrepresented languages. This data, available under a Creative Commons (CC-BY) license, is meant to help developers and researchers build or adapt speech recognition models for specific local needs.

Scaling to new languages with in-context learning

A key feature of Omnilingual ASR is its "Bring Your Own Language" option, which uses in-context learning: the model adapts from a few paired audio and text examples supplied at inference time, rather than through retraining.
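The article doesn't specify the programming interface behind this option, so the sketch below only illustrates the general shape of in-context adaptation: a handful of paired audio-text examples stands in for training, with no weight updates. The Example class, build_context helper, and the commented transcribe call are hypothetical names for illustration, not Meta's published API.

```python
from dataclasses import dataclass

@dataclass
class Example:
    audio_path: str   # short recording in the target language
    transcript: str   # its ground-truth text

def build_context(examples: list[Example]) -> list[tuple[str, str]]:
    """Pack paired audio/text examples into the prompt the model conditions on."""
    return [(e.audio_path, e.transcript) for e in examples]

# A few paired recordings are the only "training" step for a new language:
context = build_context([
    Example("clip_001.wav", "first transcript in the new language"),
    Example("clip_002.wav", "second transcript in the new language"),
])

# A real call would pass the context alongside the audio to be transcribed,
# e.g. model.transcribe("unseen_clip.wav", context=context) in a hypothetical API.
print(context)
```

The appeal of this design is that extending coverage requires only data collection, not a training pipeline: a community with a few recorded and transcribed clips can, in principle, bootstrap recognition for its own language.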


Meta’s Omnilingual ASR claims coverage of more than 1,600 languages, a stark contrast to the far narrower range traditionally supported by speech recognition tools. The system reports a character error rate below 10 percent for 78 percent of those languages, and for the subset with at least ten hours of training audio, 95 percent meet that threshold. Yet for low-resource languages with under ten hours of audio, only 36 percent achieve the same error rate, leaving a sizable portion without comparable accuracy.

Can this performance level be sustained across real-world deployments? The answer is unclear, as the article doesn’t detail how the model handles dialectal variation, background noise, or user interaction. Moreover, the article’s note that 500 of the 1,600 supported languages … is left incomplete, preventing a full assessment of coverage depth.

While the headline numbers suggest progress toward broader linguistic inclusion, the lack of detail on evaluation conditions and long‑term robustness invites cautious optimism rather than unqualified endorsement.

Common Questions Answered

What character error rate does Meta's Omnilingual ASR achieve for the majority of the 1,600 languages it was tested on?

Meta reports that Omnilingual ASR attains a character error rate (CER) below 10 % for 78 % of the roughly 1,600 languages evaluated. This performance indicates usable accuracy across a vast linguistic spectrum compared with most commercial systems.

How does the amount of training audio affect Omnilingual ASR's error rate performance?

For languages that have at least ten hours of training audio, 95 % achieve a CER under 10 %, showing a strong correlation between data volume and accuracy. In contrast, languages with less than ten hours of audio see a much lower success rate.

What proportion of low‑resource languages (under ten hours of audio) meet the sub‑10 % character error rate threshold?

Only 36 % of the low‑resource languages—those with fewer than ten hours of training data—reach a CER below 10 %. This leaves a sizable majority of such languages still above the desired error rate.

What additional resource did Meta release alongside the Omnilingual ASR model to support research?

Meta also made publicly available the Omnilingual ASR Corpus, a large dataset of transcribed speech in 350 underrepresented languages, released under a Creative Commons (CC-BY) license. The corpus is intended to enable further academic study and real-world applications of multilingual speech recognition.