Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition
Zhengdong Yang, Qianying Liu, Sheng Li, Fei Cheng, Chenhui Chu
arXiv.org Artificial Intelligence
We present a novel approach centered on the decoding stage of Automatic Speech Recognition (ASR) that improves multilingual performance, especially for low-resource languages. The approach uses cross-lingual embedding clustering to construct a hierarchical Softmax (H-Softmax) decoder, so that similar tokens across different languages share similar decoder representations. This addresses the limitations of the previous Huffman-based H-Softmax method, which relied on shallow features when assessing token similarity. Experiments on a downsampled dataset of 15 languages demonstrate that the approach improves low-resource multilingual ASR accuracy.
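The abstract's core idea, grouping tokens by embedding similarity and using the groups as the hierarchy of an H-Softmax decoder, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a flat two-level hierarchy (cluster, then token within cluster), plain k-means as the clustering step, and random vectors standing in for real cross-lingual token embeddings and decoder states. All names (`kmeans`, `hsoftmax_log_probs`, `w_cluster`, `w_token`) are hypothetical.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one contiguous cluster id per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Assign every embedding to its nearest center (squared Euclidean).
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; keep it if empty.
        for c in range(k):
            if (labels == c).any():
                centers[c] = x[labels == c].mean(axis=0)
    # Renumber so every id in 0..n_clusters-1 has at least one member.
    return np.unique(labels, return_inverse=True)[1]


def log_softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    return z - np.log(np.exp(z).sum())


def hsoftmax_log_probs(h, labels, w_cluster, w_token):
    """log P(token | h) = log P(cluster | h) + log P(token | cluster, h)."""
    log_p_cluster = log_softmax(w_cluster @ h)
    token_scores = w_token @ h
    log_p = np.empty(len(labels))
    for c in range(w_cluster.shape[0]):
        members = np.flatnonzero(labels == c)
        # Tokens compete only with the other tokens in their own cluster.
        log_p[members] = log_p_cluster[c] + log_softmax(token_scores[members])
    return log_p


# Toy setup (assumption): 8 tokens pooled from several languages, 4-dim embeddings.
rng = np.random.default_rng(0)
token_emb = rng.normal(size=(8, 4))
labels = kmeans(token_emb, k=3)
n_clusters = labels.max() + 1

w_cluster = rng.normal(size=(n_clusters, 4))  # one output vector per cluster
w_token = rng.normal(size=(8, 4))             # one output vector per token
h = rng.normal(size=4)                        # stand-in decoder hidden state

log_p = hsoftmax_log_probs(h, labels, w_cluster, w_token)
print(labels, np.exp(log_p).sum())
```

Because each token's probability is the product of its cluster probability and its within-cluster probability, the token probabilities sum to 1 up to floating-point error. In this sketch, tokens placed in the same cluster share the cluster-level output vector, which is one simple way to read "share similar decoder representations".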
Jan-29-2025
- Country:
- Asia
- Europe
- Austria > Styria
- Graz (0.04)
- Czechia > South Moravian Region
- Brno (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Portugal (0.04)
- United Kingdom > England
- East Sussex > Brighton (0.04)
- North America
- Oceania > Australia
- Genre:
- Research Report
- New Finding (0.93)
- Promising Solution (0.66)
- Technology: