Exploring ASR-Based Wav2Vec2 for Automated Speech Disorder Assessment: Insights and Analysis
Nguyen, Tuan; Fredouille, Corinne; Ghio, Alain; Balaguer, Mathieu; Woisard, Virginie
arXiv.org Artificial Intelligence
The Wav2Vec2 ASR-based model has been fine-tuned for automated speech disorder quality assessment tasks, yielding impressive results and setting a new baseline for Head and Neck Cancer speech contexts. This demonstrates that the ASR dimension from Wav2Vec2 closely aligns with assessment dimensions. Despite its effectiveness, this system remains a black box with no clear interpretation of the connection between the model's ASR dimension and clinical assessments. This paper presents the first analysis of this baseline model for speech quality assessment, focusing on intelligibility and severity tasks.

Some automatic systems have shown robust performance and stability by learning from expert decisions [6, 7]. In 2024, Nguyen et al. [8] introduced a system that leverages the Automatic Speech Recognition (ASR) based Wav2Vec2 model [9], known for its strong capability in learning speech representations. This approach compared self-supervised learning (SSL) and the ASR dimension for speech quality assessment. It is shown that the fine-tuning ...
Oct-10-2024
- Genre:
  - Research Report (0.64)
- Industry:
  - Health & Medicine > Therapeutic Area > Oncology > Head & Neck Cancer (0.70)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language (0.95)
    - Speech > Speech Recognition (1.00)