Two AI (Artificial Intelligence) PhD students win Mozilla’s Common Voice competition

Preben Vangberg and Leena Farhat, both PhD students funded by the UKRI (UK Research and Innovation) Centre for Doctoral Training in Artificial Intelligence, Machine Learning and Advanced Computing (AIMLAC), have won Mozilla’s Common Voice, Our Voices, Diversity Model and Methods Competition.

Preben Vangberg and Leena Farhat are studying for their PhDs in the School of Computer Science and Electronic Engineering at Bangor University. Their PhDs are funded by the Artificial Intelligence, Machine Learning and Advanced Computing (AIMLAC) Centre for Doctoral Training (CDT), which is in turn funded by UKRI. Their PhD theses investigate big data and Artificial Intelligence. Preben’s work is a collaboration between Computer Science and the Language Technologies Unit at Canolfan Bedwyr (big language data), and Leena’s is a collaboration between Computer Science and Ocean Sciences (big social science data).

Mozilla announced a global competition in June 2022. The competition was designed to investigate bias within language models, especially speech-to-text models, with the aim of furthering diversity and inclusivity within them. Mozilla collected submissions from around the world, and Preben and Leena were one of the winning teams.

“It was an honour to win this competition, along with the other winners. Our model was focused on the minority language Romansh. There are two dialects of this Swiss language, Sursilvan and Vallader.” Preben went on to say, “We were delighted that the judges praised our work for its performance and small error rate.”

“Because this is a minority language, there are limited resources, both audio recordings and text, with which to train the model. We used old Swiss newspapers from a publicly available, high-quality clear-text corpus.”

Preben continued:

“The method works in several phases. First, we trained an acoustic model responsible for transforming the sounds into their textual representation. Then we trained a language model (a glorified N-gram model) to aid the acoustic model and fix the spelling mistakes in its output. Third, we trained bespoke models for the various dialects of Romansh, while also emulating the effect of having good text data for the individual dialects but lacking any spoken data. Our models performed well, but that is only part of what made the project interesting. We showed that you can build good speech-to-text models by using a language model trained on the target dialect while using an acoustic model designed for a different dialect.”
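The idea of a language model correcting an acoustic model's output can be illustrated with a toy sketch. This is not the team's implementation: it assumes a simple bigram model with add-one smoothing, and the function names, the English placeholder corpus, the candidate transcriptions, their acoustic scores and the `lm_weight` value are all invented for illustration.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def lm_log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        numerator = bigrams[(prev, word)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


def rescore(candidates, unigrams, bigrams, lm_weight=0.5):
    """Return the candidate text with the best combined score.

    Each candidate is (text, acoustic_log_score); the language model
    score is added with a hypothetical weight.
    """
    def combined(candidate):
        text, acoustic_score = candidate
        return acoustic_score + lm_weight * lm_log_prob(text, unigrams, bigrams)

    return max(candidates, key=combined)[0]


# Placeholder text corpus standing in for the target-dialect text data.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigram_lm(corpus)

# Made-up acoustic model output: the misspelt candidate scores slightly higher.
candidates = [
    ("the cat sat on the mat", -4.2),
    ("the cat sad on the mat", -4.0),
]
best = rescore(candidates, unigrams, bigrams)
print(best)
```

In this sketch the text-only model is trained separately from the acoustic scores, which mirrors the point made in the quote: the language model can come from text in the target dialect even when the acoustic model was built for a different one.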

Professor Jonathan Roberts, lead of the CDT (Centre for Doctoral Training) funding for Bangor University, added:

“It is a delight to see that Preben and Leena’s effort paid off. Their work demonstrates our ongoing interest in text analysis, AI, and collaboration between Computing and the Language Technologies Unit at Canolfan Bedwyr. Indeed, before starting their PhDs, both Leena and Preben studied on our new MSc in Language Technologies. I look forward to seeing how this work develops in the future.”
