Improving Accented Speech Recognition with Multi-Domain Training

Mar-14-2023–arXiv.org Artificial Intelligence

However, they still lack generalization capability and are not robust to domain shifts like accent variations. In this work, we use speech audio representing four different French accents to create fine-tuning datasets that improve the robustness of pre-trained ASR models. By incorporating various accents in the training set, we obtain both in-domain and out-of-domain improvements. Our numerical experiments show that we can reduce error rates by up to 25% (relative) on African and Belgian accents compared to single-domain training while keeping a good performance on standard French.

Table 1. Statistics for the datasets (duration in hours). [Only one row fragment survives extraction: CFPB [12], 4:07, 6132, 9, Belgian.]

[...] it is possible to add noise to the training data, modify voice speed, or transform voice by manipulating the vocal-source and vocal-tract characteristics [4]. Other approaches include applying speaker normalization or anonymization methods in a reverse manner, for example using Vocal Tract Length Perturbation [...]
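To make the "25% (relative)" figure in the abstract concrete, here is a minimal sketch of how a word error rate and a relative error reduction are commonly computed. The function names `word_error_rate` and `relative_reduction`, the example sentences, and the 20% and 15% error rates are hypothetical illustrations, not values or code from the paper, and the paper's own scoring setup may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error rate removed by the improved system."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - improved) / baseline


# Hypothetical transcripts: one substituted word out of three.
wer = word_error_rate("le chat dort", "le chien dort")
print(f"WER: {wer:.3f}")

# Hypothetical error rates: 20% (single-domain) vs. 15% (multi-domain).
gain = relative_reduction(0.20, 0.15)
print(f"Relative reduction: {gain:.0%}")
```

Under these assumed numbers, an absolute drop of 5 points from a 20% baseline corresponds to a 25% relative reduction, which is the sense in which "relative" is normally used for error rates.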

artificial intelligence, machine learning, speech recognition

Country:
- North America > Canada
  - Quebec (0.04)
- Europe > France
  - Brittany > Ille-et-Vilaine > Rennes (0.04)
- Africa
  - Republic of the Congo (0.05)
  - Niger (0.05)
  - Gabon (0.05)
  - Democratic Republic of the Congo (0.05)
  - Chad (0.05)
  - Cameroon (0.05)

Genre:
- Research Report (0.50)

Industry:
- Transportation (0.33)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Speech > Speech Recognition (0.66)
