Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning
Abdullah, Abdulhady Abas, Karim, Sarkhel H. Taher, Ahmed, Sara Azad, Tariq, Kanar R., Rashid, Tarik A.
–arXiv.org Artificial Intelligence
Speaker diarization, a core problem in speech processing, entails partitioning an audio stream according to who is speaking. Although models for high-resource languages have progressed considerably, applying the same process to low-resource languages such as Kurdish raises specific difficulties: very few annotated datasets are available, the language has several dialects, and speakers code-switch frequently. This study addresses these challenges by fine-tuning the Wav2Vec 2.0 self-supervised learning (SSL) model on a Kurdish dataset prepared for this purpose. Through transfer learning, multilingual representations learned from other languages were adapted to the phonetic and acoustic features of Kurdish speech. Compared to the baseline algorithm, the overall Diarization Error Rate (DER) was reduced by 7.2% and cluster purity increased by 13%. These results show that adapting a state-of-the-art model can improve performance for under-resourced languages. Applications of this work include transcription services for Kurdish-language media programs, as well as speaker segmentation in multilingual call centers, teleconferencing, and videoconferencing systems. The work therefore demonstrates that self-supervised and transfer learning techniques can improve speaker diarization for Kurdish and other low-resource languages with diverse features. The approach provides a base for building effective diarization systems in other understudied languages, which remains essential for equity in speech technology.
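The two metrics reported in the abstract, DER and cluster purity, can be illustrated with a minimal frame-level sketch. This is not the authors' evaluation code. It assumes each audio frame carries one reference speaker label and one hypothesized cluster label (with `None` marking non-speech), and it finds the cluster-to-speaker mapping by brute force, which is only practical for a handful of speakers. The function names `cluster_purity` and `frame_der` are hypothetical. Standard DER is time-weighted and often scored with a forgiveness collar, which this simplification omits.

```python
from collections import Counter
from itertools import permutations


def cluster_purity(reference, hypothesis):
    """Share of frames that belong to the majority reference speaker of their cluster."""
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    if not reference:
        raise ValueError("label sequences must be non-empty")
    # Count the reference speakers that fall inside each hypothesized cluster.
    per_cluster = {}
    for ref, hyp in zip(reference, hypothesis):
        per_cluster.setdefault(hyp, Counter())[ref] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(reference)


def frame_der(reference, hypothesis):
    """Simplified frame-level DER: (missed + false alarm + confusion) / speech frames.

    Assumption: labels are strings, and None marks a non-speech frame.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    speech_frames = sum(ref is not None for ref in reference)
    if speech_frames == 0:
        raise ValueError("reference contains no speech frames")
    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_clusters = sorted({h for h in hypothesis if h is not None})
    # Pad with None so that any cluster may also stay unmapped.
    targets = ref_speakers + [None] * len(hyp_clusters)
    best_errors = None
    # Brute-force search over one-to-one cluster-to-speaker mappings.
    for assignment in permutations(targets, len(hyp_clusters)):
        mapping = dict(zip(hyp_clusters, assignment))
        errors = 0
        for ref, hyp in zip(reference, hypothesis):
            if (ref is None) != (hyp is None):
                errors += 1  # missed speech or false alarm
            elif ref is not None and mapping[hyp] != ref:
                errors += 1  # speaker confusion
        if best_errors is None or errors < best_errors:
            best_errors = errors
    return best_errors / speech_frames


reference = ["A", "A", "A", "B", "B", "B"]
hypothesis = ["x", "x", "y", "y", "y", "y"]
print(cluster_purity(reference, hypothesis))
print(frame_der(reference, hypothesis))
```

In this toy input, one frame of speaker A is absorbed into the cluster dominated by speaker B, so both metrics register a single misplaced frame out of six. For real evaluations with many speakers, an assignment solver such as the Hungarian algorithm would replace the brute-force search.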
Apr-29-2025
- Country:
- Asia
- China > Guangdong Province
- Shenzhen (0.04)
- Middle East > Iraq
- Erbil Governorate > Erbil (0.04)
- Halabja Governorate > Halabja (0.04)
- Kurdistan Region > Sulaymaniyah Governorate (0.04)
- Genre:
- Research Report > New Finding (0.88)
- Industry:
- Education (0.67)
- Health & Medicine (0.68)
- Media (0.48)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Inductive Learning (0.68)
- Neural Networks > Deep Learning (1.00)
- Natural Language (1.00)
- Speech > Speech Recognition (1.00)