IndQA Targets India's Billion Non‑English Users, 2nd‑Largest ChatGPT Market
Why does a new benchmark matter now? Most of India's internet population still navigates in languages other than English. IndQA, the latest research effort from the team behind ChatGPT, zeroes in on that gap.
The benchmark isn't just another multilingual experiment; it's built for a market that already ranks second in global ChatGPT usage. The country's linguistic diversity spans 22 official languages, with at least seven spoken by more than 50 million people each. By focusing on non‑English interactions, the project aims to measure how well AI serves the users who rely on those languages.
The release signals a concrete step toward broader accessibility, aligning product development with the realities of a country where about a billion people do not use English as their primary language. In short, the work underscores a commitment to serve Indian users in the languages they actually converse in.
India has about a billion people who don't use English as their primary language, 22 official languages (including at least seven with over 50 million speakers), and is ChatGPT's second largest market. This work is part of our ongoing commitment to improve our products and tools for Indian users, and to make our technology more accessible throughout the country. IndQA evaluates knowledge and reasoning about Indian culture and everyday life in Indian languages.
It spans 2,278 questions across 12 languages and 10 cultural domains, created in partnership with 261 domain experts from across India. Unlike existing benchmarks such as MMMLU and MGSM, it is designed to probe culturally nuanced, reasoning-heavy tasks that existing evaluations struggle to capture. IndQA covers a broad range of culturally relevant topics: Architecture & Design, Arts & Culture, Everyday Life, Food & Cuisine, History, Law & Ethics, Literature & Linguistics, Media & Entertainment, Religion & Spirituality, and Sports & Recreation. Items are written natively in Bengali, English, Hindi, Hinglish, Kannada, Marathi, Odia, Telugu, Gujarati, Malayalam, Punjabi, and Tamil.
Note: We specifically added Hinglish given the prevalence of code-switching in conversations. Each datapoint includes a culturally grounded prompt in an Indian language, an English translation for auditability, rubric criteria for grading, and an ideal answer that reflects expert expectations.
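The datapoint structure described above can be pictured with a minimal Python sketch. The four content fields (prompt, English translation, rubric criteria, ideal answer) come from the description; everything else is an assumption made for illustration, including the class and field names, the per-criterion `points` weight, and the scoring rule in `rubric_score`. It should not be read as the benchmark's actual schema or grading method.

```python
from dataclasses import dataclass, field


@dataclass
class RubricCriterion:
    description: str  # what an expert expects a good answer to contain
    points: float     # hypothetical weight; the article does not specify weighting


@dataclass
class IndQAItem:
    prompt: str               # culturally grounded prompt in an Indian language
    english_translation: str  # English translation kept for auditability
    ideal_answer: str         # answer reflecting expert expectations
    language: str             # e.g. "Hindi" (hypothetical metadata field)
    domain: str               # e.g. "Food & Cuisine" (hypothetical metadata field)
    rubric: list[RubricCriterion] = field(default_factory=list)


def rubric_score(item: IndQAItem, criteria_met: list[bool]) -> float:
    """Return the fraction of rubric points earned (assumed scoring rule)."""
    if len(criteria_met) != len(item.rubric):
        raise ValueError("need exactly one judgment per rubric criterion")
    total = sum(c.points for c in item.rubric)
    if total == 0:
        return 0.0
    earned = sum(c.points for c, met in zip(item.rubric, criteria_met) if met)
    return earned / total


# Placeholder item: the strings stand in for real content, which is not reproduced here.
example = IndQAItem(
    prompt="<prompt in Hindi>",
    english_translation="<English translation of the prompt>",
    ideal_answer="<expert-written ideal answer>",
    language="Hindi",
    domain="Food & Cuisine",
    rubric=[
        RubricCriterion("<criterion 1>", 2.0),
        RubricCriterion("<criterion 2>", 1.0),
        RubricCriterion("<criterion 3>", 1.0),
    ],
)

# A grader (human or model) judges each criterion as met or not met.
score = rubric_score(example, [True, False, True])
```

Keeping the English translation alongside the native prompt lets a reviewer who does not read the language still audit what was asked and how the answer was graded.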
Will IndQA close the gap? The initiative is pitched as a step toward AI that is useful across languages, not only in English. By acknowledging that roughly 80 percent of the global population speaks a language other than English, the team highlights a persistent blind spot in current evaluation suites.
Existing benchmarks such as MMMLU are described as saturated, with top models clustering near ceiling scores, making further progress hard to gauge. India, with about a billion non‑English primary speakers, 22 official languages and at least seven languages exceeding 50 million speakers, represents a substantial testbed; it’s also ChatGPT's second‑largest market. IndQA is presented as part of a broader commitment to improve products for these users.
Yet the announcement does not detail how IndQA's rubric-based scoring will differ from or overcome the limitations of MMMLU, nor does it explain how performance gains will be validated in real‑world settings. The effort is promising, but whether it will translate into measurable advances for Indian users is still unclear.
Further Reading
- Introducing IndQA - OpenAI
- Can Homegrown Indic Language AI Models Scale in 2025? - Techquity India
- Google AI Mode Rolls Out In 7 New Indian Languages - NDTV
- India Indigenous AI Model: Rise of Homegrown LLMs in 2025 - TechMitra
- India's internet users to exceed 900 mn in 2025, driven by Indic languages - Business Standard
 
Common Questions Answered
What is IndQA and why was it created?
IndQA is a new benchmark developed by the team behind ChatGPT that evaluates knowledge and reasoning about Indian culture and everyday life in Indian languages. It was created to better serve the roughly one billion people in India who do not use English as their primary language.
How many official languages does India have and how many of them have over 50 million native speakers?
India has 22 official languages, and at least seven of those languages each have more than 50 million native speakers. This linguistic diversity underpins the need for multilingual AI tools like IndQA.
Why does the article describe India as ChatGPT's second‑largest market?
The article notes that India ranks second globally in ChatGPT usage, even though about a billion people in the country do not use English as their primary language. This makes India a critical market for expanding multilingual AI capabilities.
What shortcomings of existing benchmarks such as MMMLU does IndQA aim to overcome?
Existing benchmarks like MMMLU are described as saturated, with top models achieving near‑ceiling scores that make further progress difficult to measure. IndQA seeks to provide a more challenging and culturally relevant evaluation for Indian languages, helping to gauge true multilingual usefulness.