Google has significantly expanded its Translate platform by adding support for 110 new languages. This move, facilitated by the advanced capabilities of the PaLM 2 large language model (LLM), marks the largest expansion in the history of Google Translate. The new languages collectively represent over 614 million speakers, or approximately 8% of the world’s population.
Role of PaLM 2
The integration of the PaLM 2 AI model has been pivotal in this expansion. PaLM 2, which also underpins other Google products such as Bard (now known as Gemini), has enabled Google Translate to efficiently learn and support languages that are closely related to one another. This includes languages close to Hindi, such as Awadhi and Marwadi, as well as French creoles like Seychellois Creole and Mauritian Creole.
Significant Language Additions
Among the newly supported languages, Cantonese stands out as one of the most requested additions. Despite its widespread use, Cantonese posed a challenge due to its overlap with Mandarin in writing, complicating data collection and model training. Another notable addition is Punjabi (Shahmukhi), which is predominantly spoken in Pakistan.
Google has also added Manx, the Celtic language of the Isle of Man, which nearly went extinct in the 1970s but has seen a revival. Tok Pisin, the lingua franca of Papua New Guinea, is another significant inclusion. Because Tok Pisin is an English-based creole, English speakers may find its translations partly intelligible.
Focus on Regional and Indigenous Languages
The expansion is also Google Translate’s largest addition of African languages to date, with about a quarter of the new languages coming from the continent. These include Fon, Kikongo, Luo, Ga, Swati, Venda, and Wolof. Their addition reflects Google’s commitment to supporting regional varieties, dialects, and different spelling standards.
Preservation Efforts
Google’s initiative also supports languages that are the focus of revitalization efforts, even if they have few or no active speakers. Manx, for example, has seen a resurgence driven by community efforts.
This expansion is part of Google’s broader commitment to its 1,000 Languages Initiative, announced in 2022. The initiative aims to use AI models to support the world’s 1,000 most-spoken languages. As technology continues to advance and Google collaborates with linguists and native speakers, the company plans to add even more languages and dialects to Translate.