Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
Hu, Shujie, Xie, Xurong, Jin, Zengrui, Geng, Mengzhe, Wang, Yi, Cui, Mingyu, Deng, Jiajun, Liu, Xunying, Meng, Helen
arXiv.org Artificial Intelligence
Automatic recognition of disordered and elderly speech remains a highly challenging task to date due to the difficulty in collecting such data in large quantities. This paper explores a series of approaches to integrate domain adapted Self-Supervised Learning (SSL) pre-trained models into TDNN and Conformer ASR systems for dysarthric and elderly speech recognition: a) input feature fusion between standard acoustic frontends and domain adapted wav2vec2.0

The associated neural speech representations produced by these pre-trained ASR systems are also inherently robust to domain mismatch [24-26]. Although they have been successfully applied to a range of normal speech processing tasks including speech recognition [21-23, 27], speech emotion recognition [28] and speaker recognition [29], very limited research on SSL pre-trained models for disordered and elderly speech has been conducted [24, 30, 31]. Among these,
Jun-22-2023
- Genre:
- Research Report (0.40)
- Industry:
- Health & Medicine > Therapeutic Area > Neurology (0.69)
- Technology: