Good morning, Dear Reader,
Last week, I struck up a conversation with ChatGPT in my mother tongue, Punjabi, and it wasn’t great. Instead of the immersive experience I had hoped for, the bot yanked me back to reality.
For one, it made basic spelling and pluralisation errors and missed the idiomatic meaning of certain phrases. For another, its responses were peppered with the kind of Punjabi you’d typically hear in a Bollywood flick, where Hindi words bleed into the Punjabi.
[Image: A 100-word paragraph that ChatGPT generated on the linguistic diversity in India, in English]
[Image: ChatGPT’s attempt at translating the same paragraph to Punjabi produced many errors]
In fact, that’s the next big barrier for companies seeking to make a mark in India right now: mastering the vernacular. It applies to everyone from OpenAI to the government-funded LLM initiative Bharatgen to AI4India, the AI-research lab at IIT Madras.
In the last three months alone, the sector has seen three significant developments:
- In September, the government launched a beta version of Adi Vaani, billed as “India’s first AI-powered translator for tribal languages”
- In November, photo-sharing app Instagram announced Meta AI voice translation for reels produced in five Indian languages—Bengali, Telugu, Tamil, Kannada, and Marathi
- In December, OpenAI rolled out a campaign promoting the use of ChatGPT in Indian languages. This followed its release of IndQA, a benchmark to evaluate AI models’ understanding of Indian languages
These developments suggest momentum, but the world of vernacular AI remains split into two universes. In one, datasets built by private players such as Pareto, Mercor, and Welocalize (which train models for the likes of Gemini, ChatGPT, and Perplexity) solve for efficiency and are popular. In the other, government datasets like Bhashini address accuracy and nuance but remain limited in their use.
These universes seldom cross over. As a result, datasets remain fragmented, the gaps between private and public corpora are wide, and the complexities of each language add to the chaos for AI companies hoping to cater to the millions of Indians for whom English isn’t even the first language.
That raises the question: who’s left out, even amid this attempt at inclusion?
Models, municipalities, and the missed majority
Earlier this year, the municipal corporation of Gurugram partnered with AI startup Sarvam for property-tax collection. “We used AI to remind people in Hindi to pay their tax,” said Aditya Mudgal, GTM, AI policy and government applications at Sarvam.
