SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?
Li, Senyu, Wang, Jiayi, Ali, Felermino D. M. A., Cherry, Colin, Deutsch, Daniel, Briakou, Eleftheria, Sousa-Silva, Rui, Cardoso, Henrique Lopes, Stenetorp, Pontus, Adelani, David Ifeoluwa
–arXiv.org Artificial Intelligence
Evaluating machine translation (MT) quality for under-resourced African languages remains a significant challenge, as existing metrics often suffer from limited language coverage and poor performance in low-resource settings. While recent efforts, such as AfriCOMET, have addressed some of these issues, they are still constrained by small evaluation sets, a lack of publicly available training data tailored to African languages, and inconsistent performance in extremely low-resource scenarios. In this work, we introduce SSA-MTE, a large-scale human-annotated MT evaluation (MTE) dataset covering 14 African language pairs from the News domain, with over 73,000 sentence-level annotations from a diverse set of MT systems. Based on this data, we develop SSA-COMET and SSA-COMET-QE, improved reference-based and reference-free evaluation metrics. We also benchmark prompting-based approaches using state-of-the-art LLMs such as GPT-4o, Claude-3.7, and Gemini 2.5 Pro. Our experimental results show that SSA-COMET models significantly outperform AfriCOMET and are competitive with Gemini 2.5 Pro, the strongest LLM evaluated in our study, particularly on low-resource languages such as Twi, Luo, and Yoruba. All resources are released under open licenses to support future research.
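As a rough illustration of how an automatic metric or an LLM judge can be compared against sentence-level human annotations, the sketch below computes a Spearman rank correlation between metric scores and human scores. This is an assumption about the general evaluation setup, not the procedure from the paper: the abstract does not say which correlation statistic is used, and the scores and the names `human`, `metric_a`, and `metric_b` are invented toy values.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data (hypothetical): human quality scores for five translations,
# and the scores two imaginary metrics assign to the same translations.
human = [85.0, 40.0, 62.0, 15.0, 90.0]
metric_a = [0.81, 0.45, 0.60, 0.20, 0.88]
metric_b = [0.70, 0.72, 0.40, 0.35, 0.65]

print(f"metric_a vs human: {spearman(human, metric_a):.3f}")
print(f"metric_b vs human: {spearman(human, metric_b):.3f}")
```

In this toy setup, `metric_a` orders the translations exactly as the human annotators do, so it reaches a higher correlation than `metric_b`. A reference-based and a reference-free metric can be compared this way, since both produce one score per translation.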
Oct-7-2025
- Country:
  - Africa
    - Angola (0.05)
    - Guinea-Bissau (0.04)
    - Mozambique (0.04)
    - The Gambia (0.04)
  - Asia
    - Middle East
      - Israel (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
    - Singapore (0.04)
  - Europe
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Middle East > Malta
      - Eastern Region > Northern Harbour District > St. Julian's (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - United Kingdom > Wales (0.05)
  - North America
    - Canada
    - Mexico > Mexico City
      - Mexico City (0.04)
    - United States
      - Florida > Miami-Dade County
        - Miami (0.04)
      - Michigan > Washtenaw County
        - Ann Arbor (0.04)
      - New Mexico > Bernalillo County
        - Albuquerque (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Information Technology (0.67)
- Technology: