DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction
Solee Im, Wonjun Lee, Jinmyeong An, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee
arXiv.org Artificial Intelligence
We present DeRAGEC, a method for improving Named Entity (NE) correction in Automatic Speech Recognition (ASR) systems. DeRAGEC extends the Retrieval-Augmented Generative Error Correction (RAGEC) framework with synthetic denoising rationales that filter out noisy NE candidates before correction. Leveraging phonetic similarity and augmented definitions, it refines the noisy retrieved NEs through in-context learning and requires no additional training. Experimental results on the CommonVoice and STOP datasets show significant improvements in Word Error Rate (WER) and NE hit ratio over baseline ASR and RAGEC methods. Specifically, we achieve a 28% relative reduction in WER compared to ASR without postprocessing. Our source code is publicly available at: https://github.com/solee0022/deragec
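For readers unfamiliar with the headline metric, the following is a minimal sketch of how WER and a relative WER reduction are commonly computed. It is not taken from the DeRAGEC repository: the function names `word_error_rate` and `relative_reduction` and the example sentences are illustrative assumptions, and the paper's own evaluation may apply text normalization that this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical utterance in which the ASR output splits a named entity.
    reference = "play songs by adele"
    asr_output = "play songs by a dell"
    print(word_error_rate(reference, asr_output))
    # Hypothetical WER values chosen only to illustrate the arithmetic.
    print(relative_reduction(0.25, 0.18))
```

Under this definition, a relative reduction of 28% means the corrected system's WER is 72% of the uncorrected ASR system's WER; it does not mean the absolute WER dropped by 28 percentage points.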
Jun-10-2025