Goto

Collaborating Authors

 language disparity


CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition

arXiv.org Artificial Intelligence

Modern deep learning models often achieve high overall performance, but consistently fail on specific subgroups. Group distributionally robust optimization (group DRO) addresses this problem by minimizing the worst-group loss, but it fails when group losses misrepresent performance differences between groups. This is common in domains like speech, where the widely used connectionist temporal classification (CTC) loss scales with input length and varies with linguistic and acoustic properties, leading to spurious differences between group losses. We present CTC-DRO, which addresses the shortcomings of the group DRO objective by smoothing the group weight update to prevent overemphasis on consistently high-loss groups, while using input length-matched batching to mitigate CTC's scaling issues. We evaluate CTC-DRO on the task of multilingual automatic speech recognition (ASR) across five language sets from the ML-SUPERB 2.0 benchmark. CTC-DRO consistently outperforms group DRO and CTC-based baseline models, reducing the worst-language error by up to 65.9% and the average error by up to 47.7%. CTC-DRO can be applied to ASR with minimal computational costs, and offers the potential for reducing group disparities in other domains with similar challenges.


New Trends in NLP Research

#artificialintelligence

Natural language processing (NLP) is a field that uses text data and runs computations to gain insights and build predictive systems. Vast amounts of text data are available in written manuscripts and more so online on the web. These data sources have been used in the research community and industries to solve meaningful problems such as predicting the sentiment in a user comment, question answering, and fact-checking. Recent machine learning algorithms have enabled superhuman performance in a wide range of NLP tasks[1]. In this article, we summarize the recent machine learning research trends in NLP which have not only led to a plethora of breakthroughs but also resulted in a growing interest in this field of research. A big chunk of the breakthroughs can be attributed to the large language models that are built using neurons and trained using backpropagation.