Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Shi, Jiatong, Chen, William, Berrebbi, Dan, Wang, Hsiu-Hsuan, Huang, Wei-Ping, Hu, En-Pei, Chuang, Ho-Lam, Chang, Xuankai, Tang, Yuxun, Li, Shang-Wen, Mohamed, Abdelrahman, Lee, Hung-yi, Watanabe, Shinji
arXiv.org Artificial Intelligence
The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework, emphasizing self-supervised models in multilingual speech recognition and language identification. The challenge comprises a research track focused on applying ML-SUPERB to specific multilingual subjects, a Challenge Track for model submissions, and a New Language Track where language resource researchers can contribute and evaluate their low-resource language data in the context of the latest progress in multilingual speech recognition.

The benchmark primarily focuses on evaluating SSL models for automatic speech recognition (ASR) and language identification (LID). To cater to different use cases for SSL models, ML-SUPERB includes two tracks with four different tasks: the monolingual track (monolingual ASR) and the multilingual track (multilingual ASR, LID, joint multilingual ASR/LID). Similar to SUPERB, ML-SUPERB utilizes frozen SSL models as feature extractors and employs a lightweight downstream model that can be fine-tuned for different tracks to achieve high training efficiency. The released public benchmark of ML-SUPERB covers 143 languages.
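A minimal PyTorch sketch of the frozen-upstream, lightweight-downstream recipe described above. The `DummySSLEncoder` here is a hypothetical stand-in for a real pre-trained SSL checkpoint (its architecture and dimensions are illustrative assumptions, not ML-SUPERB's actual models): the upstream is frozen and used purely as a feature extractor, while only a small downstream head is trainable.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pre-trained SSL encoder (e.g. a wav2vec2-style
# model). In practice, a real pre-trained checkpoint would be loaded here.
class DummySSLEncoder(nn.Module):
    def __init__(self, feat_dim=768):
        super().__init__()
        # Strided conv roughly mimics an SSL front-end's waveform-to-frame rate.
        self.conv = nn.Conv1d(1, feat_dim, kernel_size=400, stride=320)

    def forward(self, wav):  # wav: (batch, samples)
        # Returns frame-level features: (batch, frames, feat_dim)
        return self.conv(wav.unsqueeze(1)).transpose(1, 2)

encoder = DummySSLEncoder()
for p in encoder.parameters():  # freeze the upstream: features only
    p.requires_grad = False

# Lightweight trainable downstream head (a linear probe over frame features).
num_classes = 143  # e.g. LID over the 143 languages the benchmark covers
head = nn.Linear(768, num_classes)

wav = torch.randn(2, 16000)  # two one-second utterances at 16 kHz
with torch.no_grad():
    feats = encoder(wav)  # frozen feature extraction
logits = head(feats).mean(dim=1)  # mean-pool frames -> utterance-level logits
print(logits.shape)  # torch.Size([2, 143])
```

Because only the head's parameters receive gradients, fine-tuning per track stays cheap, which is the training-efficiency point the abstract makes.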
Oct-9-2023