Doumbouya, Moussa Koulako Bala
CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition
Bartelds, Martijn, Nandi, Ananjan, Doumbouya, Moussa Koulako Bala, Jurafsky, Dan, Hashimoto, Tatsunori, Livescu, Karen
Modern deep learning models often achieve high overall performance, but consistently fail on specific subgroups. Group distributionally robust optimization (group DRO) addresses this problem by minimizing the worst-group loss, but it fails when group losses misrepresent performance differences between groups. This is common in domains like speech, where the widely used connectionist temporal classification (CTC) loss scales with input length and varies with linguistic and acoustic properties, leading to spurious differences between group losses. We present CTC-DRO, which addresses the shortcomings of the group DRO objective by smoothing the group weight update to prevent overemphasis on consistently high-loss groups, while using input length-matched batching to mitigate CTC's scaling issues. We evaluate CTC-DRO on the task of multilingual automatic speech recognition (ASR) across five language sets from the ML-SUPERB 2.0 benchmark. CTC-DRO consistently outperforms group DRO and CTC-based baseline models, reducing the worst-language error by up to 65.9% and the average error by up to 47.7%. CTC-DRO can be applied to ASR with minimal computational costs, and offers the potential for reducing group disparities in other domains with similar challenges.
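The failure mode the abstract attributes to group DRO can be illustrated with a minimal sketch of the standard exponentiated-gradient group weight update. This is the generic group DRO update, not the smoothed CTC-DRO update, whose exact form the abstract does not give. The function names, step size, and per-group loss values are hypothetical and chosen only for illustration.

```python
import math

def group_dro_weight_update(weights, group_losses, step_size=0.1):
    """One standard group DRO step: upweight groups in proportion to exp(step_size * loss)."""
    scaled = [w * math.exp(step_size * loss) for w, loss in zip(weights, group_losses)]
    total = sum(scaled)
    # Renormalize so the weights stay on the probability simplex.
    return [s / total for s in scaled]

def weighted_loss(weights, group_losses):
    """Training objective for the model parameters: the weighted sum of group losses."""
    return sum(w * loss for w, loss in zip(weights, group_losses))

# Hypothetical per-language losses. The third group's loss is consistently larger,
# for example because its utterances are longer and the CTC loss scales with length.
weights = [1 / 3, 1 / 3, 1 / 3]
losses = [1.0, 1.2, 5.0]

for _ in range(50):
    weights = group_dro_weight_update(weights, losses)

print(weights)
print(weighted_loss(weights, losses))
```

With losses that stay fixed like this, nearly all of the weight ends up on the third group, even if its larger loss reflects input length rather than worse recognition quality. This is the overemphasis that the smoothed update and length-matched batching are described as counteracting.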
Machine Translation for Nko: Tools, Corpora and Baseline Results
Doumbouya, Moussa Koulako Bala, Diané, Baba Mamadi, Cissé, Solo Farabado, Diané, Djibrila, Sow, Abdoulaye, Doumbouya, Séré Moussa, Bangoura, Daouda, Bayo, Fodé Moriba, Condé, Ibrahima Sory 2., Diané, Kalo Mory, Piech, Chris, Manning, Christopher
over 40 million people across West African countries including Mali, Guinea, Ivory Coast, Gambia, Burkina Faso, Sierra Leone, Senegal, Liberia, and Guinea-Bissau. Nko, which means 'I say' in all Manding languages, was developed as both the Manding literary standard language and a writing system by Soulemana Kanté in 1949 for the purpose of sustaining the strong oral tradition of Manding languages (Niane, 1974; Conde, 2017; Eberhard et al., 2023). Unfortunately, to date, there isn't any usable machine translation (MT) system for Nko, in part due to the unavailability of large text corpora required by state-of-the-art neural machine translation (NMT) algorithms. Nko is a representative case study of the broader issues that interfere with the goal of universal machine translation. Thousands of languages still don't have available or usable MT systems, mainly due to the unavailability of high-quality parallel