Improving Accented Speech Recognition with Multi-Domain Training
Maison, Lucas, Estève, Yannick
arXiv.org Artificial Intelligence
However, they still lack generalization capability and are not robust to domain shifts like accent variations. In this work, we use speech audio representing four different French accents to create fine-tuning datasets that improve the robustness of pre-trained ASR models. By incorporating various accents in the training set, we obtain both in-domain and out-of-domain improvements. Our numerical experiments show that we can reduce error rates by up to 25% (relative) on African and Belgian accents compared to single-domain training while keeping a good performance on standard French.
Table 1. Statistics for the datasets (duration in hours). Recoverable row: CFPB [12], 4:07, 6132, 9, Belgian.
[...] it is possible to add noise to the training data, modify voice speed, or transform voice by manipulating the vocal-source and vocal-tract characteristics [4]. Other approaches include applying speaker normalization or anonymization methods in a reverse manner, for example using Vocal Tract Length Perturbation [...]
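The abstract states its improvement as a relative reduction in error rate, which is easy to confuse with an absolute one. The sketch below shows the difference in Python. The function name `relative_error_reduction` and both error-rate values are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the error-rate reduction as a fraction of the baseline error rate."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers for illustration only; NOT taken from the paper.
baseline = 20.0  # error rate (%) of an assumed single-domain model
improved = 15.0  # error rate (%) of an assumed multi-domain model

absolute = baseline - improved                           # difference in percentage points
relative = relative_error_reduction(baseline, improved)  # fraction of the baseline removed

print(f"absolute reduction: {absolute:.1f} points")
print(f"relative reduction: {relative:.0%}")
```

With these assumed values, a drop of 5 percentage points corresponds to a 25% relative reduction, so a relative figure on its own does not say how large the baseline error rate was.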
Mar-14-2023
- Country:
- Africa
- Cameroon (0.05)
- Chad (0.05)
- Democratic Republic of the Congo (0.05)
- Gabon (0.05)
- Niger (0.05)
- Republic of the Congo (0.05)
- Europe > France
- Brittany > Ille-et-Vilaine > Rennes (0.04)
- North America > Canada
- Quebec (0.04)
- Genre:
- Research Report (0.50)
- Industry:
- Transportation (0.33)
- Technology: