Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training
Dong, Lukuan, Qin, Donghong, Bai, Fengbo, Song, Fanhua, Liu, Yan, Xu, Chen, Ou, Zhijian
–arXiv.org Artificial Intelligence
The mainstream automatic speech recognition (ASR) technology usually requires hundreds to thousands of hours of annotated speech data. Three approaches to low-resourced ASR are phoneme or subword based supervised pre-training, and self-supervised pre-training over multilingual data. The Iu Mien language is the main ethnic language of the Yao ethnic group in China and is low-resourced in the sense that the annotated speech is very limited. With less than 10 hours of transcribed Iu Mien language, this paper investigates and compares the three approaches for Iu Mien speech recognition.

In our practice, it takes non-trivial efforts to collect and transcribe even less than 10 hours of Iu Mien language. The development of Iu Mien language speech recognition systems is therefore very challenging, while it is very important for reducing digital divides and for culture inheritance. The paradigm of pre-training (PT) followed by fine-tuning (FT), called the PTFT paradigm, has emerged in recent years as an effective way to solve the problem of limited training data for low-resource languages in ASR. In pre-training, training data for a number of languages are merged to train a multilingual model. The pre-trained model can then serve as a backbone for fine-tuning on the target language.
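To make the PTFT paradigm concrete, the sketch below shows one common way such a setup can be wired up in PyTorch: a pre-trained multilingual acoustic encoder is reused as a backbone, a fresh output layer over the target-language units is attached, and the whole model is fine-tuned with CTC loss on the small transcribed corpus. This is an illustrative sketch, not the authors' implementation; the encoder architecture, checkpoint path, and vocabulary size are hypothetical placeholders.

```python
# Illustrative PT-FT sketch (not the paper's code): reuse a pre-trained
# multilingual encoder as a backbone and fine-tune it with CTC on a small
# target-language (Iu Mien) dataset. Paths and sizes are hypothetical.
import os
import torch
import torch.nn as nn

class AcousticEncoder(nn.Module):
    """Stand-in backbone; in practice this would be a multilingual
    phoneme-supervised or self-supervised pre-trained model."""
    def __init__(self, feat_dim=80, hidden_dim=256, num_layers=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden_dim)
        layer = nn.TransformerEncoderLayer(hidden_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feats):                   # feats: (batch, time, feat_dim)
        return self.encoder(self.proj(feats))   # (batch, time, hidden_dim)

class TargetLanguageCTCModel(nn.Module):
    """Pre-trained backbone plus a new output layer for target-language units."""
    def __init__(self, encoder, hidden_dim, vocab_size):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feats):
        return self.head(self.encoder(feats)).log_softmax(dim=-1)

encoder = AcousticEncoder()
ckpt_path = "multilingual_pretrained_encoder.pt"   # hypothetical checkpoint
if os.path.exists(ckpt_path):
    encoder.load_state_dict(torch.load(ckpt_path, map_location="cpu"))

vocab_size = 64                                    # hypothetical unit inventory (incl. CTC blank)
model = TargetLanguageCTCModel(encoder, hidden_dim=256, vocab_size=vocab_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

# One fine-tuning step on dummy tensors standing in for the <10 h of transcribed speech.
feats = torch.randn(2, 300, 80)                    # (batch, frames, filterbank dims)
targets = torch.randint(1, vocab_size, (2, 30))    # target unit indices
log_probs = model(feats).transpose(0, 1)           # CTCLoss expects (time, batch, vocab)
input_lens = torch.full((2,), 300, dtype=torch.long)
target_lens = torch.full((2,), 30, dtype=torch.long)
loss = ctc_loss(log_probs, targets, input_lens, target_lens)
loss.backward()
optimizer.step()
```

Whether the backbone comes from phoneme-based supervised pre-training, subword-based supervised pre-training, or self-supervised pre-training mainly changes how the encoder was trained; the fine-tuning step on the low-resourced target language looks much the same.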
Jul-18-2024