G42 Unveils NANDA 87B: Open-Source Hindi-English AI Model
G42 unveils open-source Hindi-English NANDA 87B, built on Llama-3.1 70B and developed by MBZUAI
The race to develop multilingual AI models just got more interesting. Tech giant G42 has stepped into the spotlight with NANDA 87B, an open-source language model targeting Hindi and English speakers.
Built through a strategic collaboration between Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Inception, and Cerebras, the model represents a significant leap in localized artificial intelligence. Its foundation on Llama-3.1 70B signals serious technical ambition.
The project highlights a growing trend: moving beyond English-centric AI development. By focusing specifically on Hindi, one of the world's most spoken languages, NANDA 87B could unlock new possibilities for millions of users who have been underserved by existing language models.
Researchers are betting big on linguistic diversity. Trained on more than 65 billion Hindi tokens, the model is not just another translation tool; it could broaden regional access to AI and improve how Hindi speakers are represented in it.
The model has been developed by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in collaboration with Inception, a G42 company, and chipmaker Cerebras. Built on Llama-3.1 70B, NANDA 87B has been trained on more than 65 billion Hindi tokens using a Hindi-centric tokenizer to improve efficiency in training and inference. "India deserves world-class technology that speaks its language. NANDA 87B is a major step in that direction," said Manu Jain, chief executive of G42 India, adding that the model is intended to support innovation across education, entertainment and enterprise use cases in India's AI ecosystem.
G42's NANDA 87B represents a significant leap for Hindi-English AI language models. The open-source model, developed by MBZUAI in partnership with Inception and Cerebras, specifically targets linguistic nuance by training on over 65 billion Hindi tokens.
Manu Jain's statement underscores the project's core mission: delivering world-class technology that authentically represents Indian linguistic complexity. By using a Hindi-centric tokenizer, the model aims to enhance both training and inference efficiency.
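To see why a Hindi-centric tokenizer can matter for efficiency, consider a rough illustration in Python. It is only a sketch and not NANDA's actual tokenizer, whose design the announcement does not describe. The sample sentence, the toy vocabulary and the function names are all assumptions made for this example. It compares a worst case, where every UTF-8 byte of Devanagari text becomes its own token, with a vocabulary that holds whole Hindi words, and reports tokens per word (often called fertility).

```python
def byte_fallback_count(text: str) -> int:
    """Token count if every UTF-8 byte of every word becomes its own token."""
    return sum(len(word.encode("utf-8")) for word in text.split())


def greedy_vocab_count(text: str, vocab: set[str]) -> int:
    """Token count under greedy longest-match against a toy vocabulary.

    Characters the vocabulary does not cover fall back to one token per byte.
    """
    count = 0
    for word in text.split():
        i = 0
        while i < len(word):
            # Try the longest remaining substring first, then shorter ones.
            for j in range(len(word), i, -1):
                if word[i:j] in vocab:
                    count += 1
                    i = j
                    break
            else:
                # No vocabulary piece starts here: fall back to raw bytes.
                count += len(word[i].encode("utf-8"))
                i += 1
    return count


def fertility(token_count: int, text: str) -> float:
    """Average number of tokens per whitespace-separated word."""
    return token_count / len(text.split())


# Hypothetical sample and vocabulary, chosen only for illustration.
SAMPLE = "भारत की भाषा"
TOY_VOCAB = {"भारत", "की", "भाषा"}

if __name__ == "__main__":
    worst = byte_fallback_count(SAMPLE)
    hindi_aware = greedy_vocab_count(SAMPLE, TOY_VOCAB)
    print("byte fallback fertility:", fertility(worst, SAMPLE))
    print("Hindi-aware fertility:", fertility(hindi_aware, SAMPLE))
```

Fewer tokens per word means shorter sequences for the same text, which is the general mechanism by which a language-specific tokenizer can reduce training and inference cost. Production tokenizers learn their vocabularies from data rather than using a hand-picked word list like the one above.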
Built on the Llama-3.1 70B foundation, NANDA 87B signals a collaborative approach to AI development. The partnership between an academic institution, a technology company, and a chipmaker highlights the multidisciplinary nature of modern AI research.
For Indian technology and language communities, this model could represent a meaningful step toward more localized, contextually aware AI systems. Still, its real-world performance remains to be fully tested.
The open-source nature of NANDA 87B suggests a commitment to transparency and collaborative technological advancement in the AI landscape.
Further Reading
- Microsoft-backed G42 scales up Nanda Hindi AI model to 87 billion parameters for India push - Moneycontrol
- Abu Dhabi's G42 Launches Largest Hindi Language AI Model, NANDA 87B - Outlook Business
- G42 Releases Open-Weight NANDA 87B Hindi–English Model Built on Llama - OpenSourceForU
- G42 unveils Nanda 87B upgrade to power Hindi-English language AI - YourStory
Common Questions Answered
How many tokens were used to train the NANDA 87B AI model?
The NANDA 87B model was trained on more than 65 billion Hindi tokens using a specialized Hindi-centric tokenizer. This approach aims to improve the model's efficiency in both training and inference processes for Hindi and English language processing.
What foundational model was used in developing NANDA 87B?
NANDA 87B uses Llama-3.1 70B as its foundation model. NANDA 87B itself was developed by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in collaboration with Inception, a G42 company, and chipmaker Cerebras. The partnership reflects the project's ambition to build a bilingual Hindi-English model.
What is the primary goal of the NANDA 87B AI model?
The primary goal of NANDA 87B is to provide world-class technology that authentically represents Indian linguistic complexity, specifically targeting Hindi and English speakers. As stated by Manu Jain, the model aims to deliver a technological solution that truly speaks the language of India.