Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis
Wang, Xintong, Shi, Mingqian, Wang, Ye
arXiv.org Artificial Intelligence
Mispronunciation Detection and Diagnosis (MDD) systems, leveraging Automatic Speech Recognition (ASR), face two main challenges in Mandarin Chinese: 1) The two-stage models create an information gap between the phoneme or tone classification stage and the MDD stage.
Subsequently, Zhang et al. [1] adopted an autoregressive model, the Recurrent Neural Network Transducer (RNN-T) [9], for MDD. This approach aims to capture the temporal dependence of mispronunciation patterns, showing better performance than Connectionist Temporal Classification ...
Jun-6-2024