India officially recognizes 22 languages, but hundreds more are spoken across its villages and cities. Only 11 percent of India's 1.4 billion people speak English, and even Hindi reaches just 57 percent of the population. Hundreds of millions of citizens are locked out of digital services not because they lack smartphones or internet, but because the technology doesn't speak their language.
I saw this problem firsthand last year when helping my parents make a simple name change on the Bangalore Electricity Department website. The site was only available in English and Kannada. I happen to speak Kannada, so I could navigate it, but what struck me was how many couldn't. Bangalore is home to a large immigrant population from other states, and for those residents, even this routine administrative task becomes an insurmountable barrier.
If educated, urban residents struggle with basic accessibility simply because of language, the challenges for millions without such advantages are exponentially greater.
Walk into any government office in India, and you'll find a familiar scene: forms in English or Hindi, computers displaying text most visitors can't read, citizens standing in line for hours just to find out if they're eligible for life-changing programs.
Bhashini: Infrastructure for Multilingual AI
Only last week, the state government of Madhya Pradesh signed a partnership with the Digital India Bhashini Division (DIBD) to integrate multilingual AI tools across the state's digital governance platforms. The agreement, formalized at a regional AI conference in the capital city Bhopal, aims to enable citizens to interact with public services in their own languages rather than defaulting to English or Hindi.
While most mainstream AI systems are trained primarily on English and optimized for English-speaking contexts, the Bhashini division is building infrastructure specifically designed for linguistic diversity, treating language access not as an afterthought but as foundational to building inclusive AI. This is part of a broader movement in India around building Digital Public Infrastructure (DPI) shaped by openness, accessibility, and inclusion.
Bhashini, short for "BHASHa INterface for India", was launched by Prime Minister Modi in 2022. The platform treats language access as public infrastructure: translation across all 22 constitutionally recognized Indian languages, speech recognition, and text-to-speech tools are exposed through standardized APIs that developers, governments, and nonprofits can use freely.
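To give a flavor of what "language access through standardized APIs" means in practice, here is a minimal sketch of assembling a translation request. The endpoint URL, field names, and payload shape below are illustrative assumptions for this post, not Bhashini's documented API; real integrations would follow the official developer documentation.

```python
import json

# Hypothetical pipeline endpoint; a real integration would use the
# URL and schema from Bhashini's official API documentation.
API_URL = "https://example.invalid/bhashini/pipeline"  # placeholder

def build_translation_request(text: str, source_lang: str, target_lang: str) -> dict:
    """Assemble a translation task payload (field names are illustrative)."""
    return {
        "pipelineTasks": [
            {
                "taskType": "translation",
                "config": {
                    "language": {
                        "sourceLanguage": source_lang,  # e.g. "kn" for Kannada
                        "targetLanguage": target_lang,  # e.g. "hi" for Hindi
                    }
                },
            }
        ],
        "inputData": {"input": [{"source": text}]},
    }

payload = build_translation_request("ನಮಸ್ಕಾರ", "kn", "hi")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The design point is that a developer only declares languages and supplies text; the platform, not the application, carries the burden of the underlying models.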

Crowdsourced Translations: BhashaDaan
Modern AI learns from vast quantities of data. But for many Indian languages, such as Konkani and Bodo, that data does not exist digitally.
Rather than waiting for data to accumulate online, the Government of India's BhashaDaan initiative (the name means "Language Donation") invites everyday speakers to contribute voice recordings, translations, and texts in their native languages.
Thousands of contributors have donated their voices, creating datasets that capture living language, including dialects, colloquialisms, and regional variations. That breadth makes the resulting tools not just technically functional but practically useful.
Contributors can participate in multiple ways: recording their voice reading pre-scripted sentences (Bolo India), transcribing audio they hear (Suno India), translating text between languages (Likho India), or transcribing text from images (Dekho India). Each contribution method is designed to be accessible and requires only a smartphone and a few minutes.
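The four activities above can be thought of as variations on one record type: who contributed, through which activity, in which language, and what they donated. The sketch below models that idea; the field names and activity keys are assumptions for illustration, not BhashaDaan's actual data schema.

```python
from dataclasses import dataclass

# Illustrative activity catalog (keys are assumptions, not platform identifiers).
CONTRIBUTION_TYPES = {
    "bolo": "voice recording of a pre-scripted sentence",
    "suno": "transcription of heard audio",
    "likho": "text translation between languages",
    "dekho": "transcription of text in an image",
}

@dataclass
class Contribution:
    contributor_id: str
    activity: str   # one of CONTRIBUTION_TYPES
    language: str   # e.g. "bodo", "konkani"
    payload: str    # donated text, transcript, or a reference to audio

    def __post_init__(self):
        # Reject activities outside the known catalog.
        if self.activity not in CONTRIBUTION_TYPES:
            raise ValueError(f"unknown activity: {self.activity}")

c = Contribution("user-001", "likho", "konkani", "translated sentence here")
print(CONTRIBUTION_TYPES[c.activity])
```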

Screenshots from the BhashaDaan platform, where users can contribute to improving translation. Above is the "Likho" activity, where users translate text between languages. Below is the "Dekho" activity, where users read text from an image and transcribe it.

But turning those contributions into working systems requires two additional pieces of infrastructure.
1) AIKosh, India's national AI datasets repository, curates and hosts the raw materials like Samanantar, a massive parallel corpus that pairs Indian languages with English for training translation models.
2) Model Vatika builds on this foundation, offering ready-to-use AI models that developers can plug into their applications without building language capabilities from scratch.
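To make the pipeline concrete, here is a minimal sketch of how Samanantar-style parallel sentence pairs could be shaped into training examples for a translation model. The record layout, language-tag convention, and the sample pair are assumptions for illustration, not AIKosh's actual schema.

```python
# Convert aligned English/Indian-language sentence pairs into
# source/target training examples. Prefixing a language tag is a
# common convention for multilingual translation models; the exact
# format here is an assumption, not Samanantar's published layout.

def to_training_examples(pairs, src_lang="en", tgt_lang="kn"):
    """Wrap aligned sentence pairs with language tags."""
    examples = []
    for en_text, indic_text in pairs:
        examples.append({
            "source": f"<{src_lang}> {en_text}",
            "target": f"<{tgt_lang}> {indic_text}",
        })
    return examples

pairs = [
    ("Hello.", "ನಮಸ್ಕಾರ."),  # English paired with Kannada
]
examples = to_training_examples(pairs)
print(examples[0]["source"])
```

Under this framing, BhashaDaan supplies the raw pairs, AIKosh curates and hosts them, and Model Vatika publishes the models trained on them.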
This pipeline transforms voice donations into publicly available digital infrastructure to expand linguistic diversity.
Where It Matters: Real Impact
According to the Government of India, thousands of village councils across the country use the SabhaSaar platform, which now operates multilingually thanks to the Bhashini Division. Local officials record meeting minutes in their native languages, and the system automatically transcribes and translates them. What once required manual translation, or wasn't documented at all, now happens seamlessly.
Another example is Jugalbandi, a WhatsApp chatbot that uses Bhashini's infrastructure, along with conversational AI, to tackle urgent public problems. Developed by OpenNyAI, a nonprofit collaborative focused on using AI to increase access to justice, Jugalbandi is an example of how translation tools made freely available through Bhashini enable organizations to build sophisticated applications serving vulnerable populations.
But the test of any technology is how it changes lives.
Consider its use for domestic violence survivors. A woman seeking protection can send a voice message in her native language asking about her legal options and receive a response in the same language explaining her rights, the protections available, and how to access them, all without navigating English-language legal websites or revealing her situation to potentially unsympathetic officials.
Similarly, OpenNyAI has tested Jugalbandi with sanitation workers in Bangalore to help them understand labor rights and access welfare programs, as well as with farmers who might have questions about accessing agricultural subsidies or programs.
For vulnerable populations facing both linguistic and systemic barriers, this clarity in their native language can be transformative.
And crucially, because the underlying language infrastructure is open and free, any organization with relevant expertise can build similar tools for their communities' specific needs.
The Future Is Multilingual (If We Choose It)
The dominant AI narrative has been about racing toward more powerful models, benchmark breakthroughs, and safely controlling advanced capabilities. These aren't unimportant questions. But they have overshadowed an equally critical one: inclusion.
At The GovLab, we co-designed an AI-powered solution called AIEP to help parents of children with special needs summarize and translate complex Individualized Education Programs (IEPs), which, among other things, contain information about services their child is eligible to receive from the school district. Several parents saw this critical information about their children's education in a language they understood for the very first time.
Well-designed, multilingual AI tools can have a massive positive impact on underserved communities, but that requires transparent auditing and validation of translation accuracy. In the case of Bhashini, without rigorous, publicly available performance metrics broken down by language, dialect, and use case, it will be challenging to measure whether the system actually works for the people who need it most or whether it introduces new harms through mistranslations in high-stakes contexts like welfare eligibility or legal rights.
The risk remains that certain languages, such as Hindi, will comprise a vast majority of the crowdsourced content, while other regional languages will be underrepresented in the training data. This imbalance could reinforce existing inequities, where speakers of dominant languages receive high-quality service while others are left with unreliable tools.
The real challenge is ensuring that the Government of India's claimed "support for 35+ languages" doesn't mask a reality in which a few languages work well while many others remain effectively unusable for critical applications.
Language is the most basic form of recognition.
India's platforms aren't perfect, and they're not finished. But they represent something the field desperately needs: a commitment to designing inclusive AI.
Thumbnail Image for this post was created using Google's Nano Banana Pro