24 June 2025

Could AI be used to bring endangered languages back from the brink?

By Morgan Kenyon

Trellis Data research and development head Cheng Yu (centre) with senior account executive Scott Bailey and chief client officer Dr Cliff Seery. Cheng says AI could lead the revival of dying languages across Australia and beyond. Photo: Thorson Photography.

Imagine taking the light rail down Northbourne Avenue on a busy afternoon, surrounded by fellow commuters.

It’d be fair to expect a few languages other than English to float around the carriage – Mandarin, Nepali, Vietnamese, or Punjabi probably wouldn’t be a surprise.

But what if you heard a conversation spoken in fluent Manx? How about Cornish, or Tok Pisin (New Guinea Pidgin), or Ngunnawal language? That might make your ears perk up, even if you had no idea what was being said.

Saving a language from extinction is no simple task, but a few languages spoken today were once thought to be lost forever.

Trellis Data head of research and development Cheng Yu believes that purpose-built AI technology could be the key to restoring even more.

“One of the most famous examples of a revived language is Hebrew, which at one point had not been spoken natively for more than 1500 years,” Cheng says.

“Linguists spent decades using religious texts to extract whatever Hebrew they could, eventually breathing new life into the language between the late 19th and early 20th century.

“The next step was to pass their knowledge on to teachers, who used it to train a new generation of native modern Hebrew speakers.”

Nine to ten million people around the world speak Hebrew today and more than half of them consider it their native language.

So yes, dead languages can absolutely be recovered, but to do it manually would cost huge amounts of time, money and resources.

Cheng believes projects like these create the perfect opportunity for purpose-trained AI large language models to show what they can do.

“Large language models are usually trained using a lot of data. But for languages that haven’t been written down, there simply isn’t enough data to use,” Cheng says.

“Even if the data were there, vocabulary would be dated and not suitable for communicating in our modern world.

“Our technology could take what we already know about a dead language’s rules and vocabulary, then use context and probability to create new, modern words and phrases.”

This would effectively bridge the gap, but Cheng stresses the results would still need careful oversight by linguists.

Trellis Data’s communications head Tim McLaren (left) with Cheng Yu, co-founders Rachel and Michael Gately, HR manager Merisha Percival and development operations head Nik Kumar. Photo: Thorson Photography.

Communications head Tim McLaren offers some insight into how the technology is being applied so far.

“Both Indonesia and Papua New Guinea are home to dozens, if not hundreds, of spoken languages that are not recorded to a large extent,” he says.

“The technology we’re developing will help translate and transcribe them, again without huge amounts of data, allowing governments and core services to better provide for their communities.”

The same AI could be applied to transcribe conversations full of industry-specific jargon, such as the acronyms used by military personnel or in hospitals.

Trellis Data has already built initial AI language models for three Indigenous languages, which will serve as a basis for future extensions.

“There’s great potential for AI to help revive Indigenous languages across Australia and in surrounding countries such as Indonesia or Papua New Guinea,” Cheng says.

“We absolutely recognise the need for engagement and approval by Indigenous elders before releasing this technology for public use.

“We look forward to working with them to not just preserve Indigenous languages, but enable their active use, as part of a future Australia that encompasses all the diversity of its current and ancient history.”

According to Cheng, supporting endangered languages isn’t just about preserving their cultural importance.

“We estimate that hundreds of jobs will be created for every language going through the revival process; up to 1000 if they went as far as designing a course syllabus,” he says.

“Say we aim ambitiously to revive the top 100 Australian Indigenous languages. Based on how much data we already have and the demand to revive them, that would build up a brand new $17.5 billion industry for Australia.

“So purely from an economic standpoint, it’s in our interest to leverage these beautiful cultural heritages that have been here in Australia for 65,000 years.”

The Trellis team delivers leading-edge machine-learning technology on an easy-to-use platform. Find out more at Trellis Data.

REGION MEDIA PARTNER CONTENT
