Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Nowakowski, Karol, Ptaszynski, Michal, Murasaki, Kyoko, Nieuważny, Jagna
arXiv.org Artificial Intelligence
In recent years, neural models learned through self-supervised pretraining on large-scale multilingual text or speech data have exhibited promising results for underresourced languages, especially when a relatively large amount of data from related language(s) is available. While the technology has the potential to facilitate tasks carried out in language documentation projects, such as speech transcription, pretraining a multilingual model from scratch for every new language would be highly impractical. We investigate the possibility of adapting an existing multilingual wav2vec 2.0 model to a new language, focusing on actual fieldwork data from a critically endangered tongue: Ainu. Specifically, we (i) examine the feasibility of also leveraging data from similar languages in fine-tuning, and (ii) verify whether the model's performance can be improved by further pretraining on target language data. Our results show that continued pretraining is the most effective method of adapting a wav2vec 2.0 model to a new language and leads to a considerable reduction in error rates. Furthermore, we find that if a model pretrained on a related speech variety or an unrelated language with similar phonological characteristics is available, multilingual fine-tuning using additional data from that language can have a positive impact on speech recognition performance when there is very little labeled data in the target language.
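The abstract reports results as reductions in error rates but does not say how they are computed. As a rough illustration only, the sketch below shows how word and character error rates are commonly derived from Levenshtein edit distance in speech recognition evaluation. The function names and the example strings are hypothetical and do not come from the paper, and the authors' exact metric and tooling may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (unit cost for substitution, insertion, and deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate: total edits divided by total reference length.

    unit="word" gives a word error rate, unit="char" a character error rate.
    """
    total_edits = 0
    total_length = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_length += len(ref_units)
    if total_length == 0:
        raise ValueError("references contain no units to score")
    return total_edits / total_length


# Placeholder transcripts (not real data): one substitution and one deletion
# against a four-word reference.
wer = error_rate(["a b c d"], ["a x c"], unit="word")
cer = error_rate(["abc"], ["abd"], unit="char")
```

Under this definition, comparing a baseline model with an adapted one amounts to scoring both sets of transcripts against the same references and comparing the resulting rates.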
Jan-17-2023
- Country:
  - Africa (0.04)
  - Oceania > Australia
  - North America > United States
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
  - Europe
    - Austria (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
  - Asia
    - Middle East > Oman (0.04)
    - Russia > Far Eastern Federal District
      - Sakhalin Oblast > Sakhalin Island (0.09)
    - Japan
      - Honshū
        - Tōhoku > Yamagata Prefecture
          - Yamagata (0.04)
        - Kantō
          - Tokyo Metropolis Prefecture > Tokyo (0.05)
          - Kanagawa Prefecture > Yokohama (0.04)
      - Hokkaidō > Hokkaidō Prefecture
        - Sapporo (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Technology:
  - Information Technology > Artificial Intelligence
    - Speech > Speech Recognition (1.00)
    - Natural Language (1.00)
    - Machine Learning (1.00)