Distinguishing Repetition Disfluency from Morphological Reduplication in Bangla ASR Transcripts: A Novel Corpus and Benchmarking Analysis
Arpa, Zaara Zabeen, Apurbo, Sadnam Sakib, Oishee, Nazia Karim Khan, Abrar, Ajwad
–arXiv.org Artificial Intelligence
Automatic Speech Recognition (ASR) is now integral to digital interaction, driving applications from virtual assistants to automated subtitles on platforms like YouTube ( Dykes et al. 2023). Despite its wide adoption and significant advancements, ASR performance remains imperfect, particularly in Large Vocabulary Continuous Speech Recognition (L VCSR) and under real-world conditions like background noise or diverse accents ( Errattahi et al. 2018; Romana et al. 2024). This often results in a significant Word Error Rate (WER). A major, persistent source of error stems from speech disfluencies, which are interruptions in the smooth flow of speech, including filled pauses, hesitations, self-corrections, and most relevantly, repetitions ( Romana et al. 2024). Disfluencies are natural and frequent; one study found a 50% probability of a disfluency in a 10-13 word sentence ( Jamshid Lou and Johnson 2020b). Their presence creates noisy transcripts that are difficult to read and detrimental to downstream Natural Language Processing (NLP) tasks such as machine translation or information extraction ( Romana et al. 2024; Jamshid Lou and Johnson 2020a).
arXiv.org Artificial Intelligence
Nov-18-2025
- Country:
- North America > United States (0.46)
- Asia > Middle East
- UAE (0.28)
- Genre:
- Research Report (0.84)
- Technology: