Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings

Attia, Ahmed Adel, Demszky, Dorottya, Ogunremi, Tolulope, Liu, Jing, Espy-Wilson, Carol

May-15-2024, arXiv.org Artificial Intelligence

Creating Automatic Speech Recognition (ASR) systems that are robust and resilient to classroom conditions is paramount to the development of AI tools that aid teachers and students. In this work, we study the efficacy of continued pretraining (CPT) in adapting Wav2vec2.0 to the classroom domain. We show that CPT is a powerful tool in that regard, reducing the Word Error Rate (WER) of Wav2vec2.0-based models by upwards of 10%. More specifically, CPT improves the model's robustness to different noises, microphones, and classroom conditions, as well as to classroom demographics. Our CPT models also generalize better to demographics unseen in the labeled finetuning data.
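
Since the headline result is stated in terms of WER, a minimal sketch of how that metric is conventionally computed may help: the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper, and the abstract does not describe the authors' scoring setup (for instance, any text normalization applied before scoring).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper for illustration; whitespace tokenization only,
    with no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("three plus four equals seven",
                      "three plus for equals seven"))  # 0.2
```

Because the denominator is the reference length, insertions can push WER above 1.0, which is worth keeping in mind when comparing models on noisy classroom audio.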

artificial intelligence, machine learning, wav2vec2

Country:
- Asia
  - Japan (0.04)
  - Russia > Far Eastern Federal District
    - Sakhalin Oblast > Sakhalin Island (0.04)
- Europe > Denmark
  - Capital Region > Copenhagen (0.04)
- North America > United States
  - California > Santa Clara County
    - Palo Alto (0.04)
    - San Jose (0.04)
  - District of Columbia > Washington (0.04)
  - Maryland (0.04)
  - New York (0.04)
  - Ohio > Lake County
    - Eastlake (0.04)
  - Texas (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Education
  - Curriculum > Subject-Specific Education (0.64)
  - Educational Setting (0.88)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Speech > Speech Recognition (1.00)
