Akinjobi, Zainab
AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages
Wang, Jiayi, Adelani, David Ifeoluwa, Agrawal, Sweta, Rei, Ricardo, Briakou, Eleftheria, Carpuat, Marine, Masiak, Marek, He, Xuanli, Bourhim, Sofia, Bukula, Andiswa, Mohamed, Muhidin, Olatoye, Temitayo, Mokayede, Hamam, Mwase, Christine, Kimotho, Wangui, Yuehgoh, Foutse, Aremu, Anuoluwapo, Ojo, Jessica, Muhammad, Shamsuddeen Hassan, Osei, Salomey, Omotayo, Abdul-Hakeem, Chukwuneke, Chiamaka, Ogayo, Perez, Hourrane, Oumaima, Anigri, Salma El, Ndolela, Lolwethu, Mangwana, Thabiso, Mohamed, Shafie Abdi, Hassan, Ayinde, Awoyomi, Oluwabusayo Olufunke, Alkhaled, Lama, Al-Azzawi, Sana, Etori, Naome A., Ochieng, Millicent, Siro, Clemencia, Njoroge, Samuel, Muchiri, Eric, Kimotho, Wangari, Momo, Lyse Naomi Wamba, Abolade, Daud, Ajao, Simbiat, Adewumi, Tosin, Shode, Iyanuoluwa, Macharm, Ricky, Iro, Ruqayya Nasir, Abdullahi, Saheed S., Moore, Stephen E., Opoku, Bernard, Akinjobi, Zainab, Afolabi, Abeeb, Obiefuna, Nnaemeka, Ogbu, Onyekachi Raphael, Brian, Sam, Otiende, Verrah Akinyi, Mbonu, Chinedu Emmanuel, Sari, Sakayo Toadoum, Stenetorp, Pontus
Despite the progress recorded in scaling multilingual machine translation (MT) models and evaluation data to several under-resourced African languages, it is difficult to measure that progress accurately, because evaluation is often performed with n-gram matching metrics such as BLEU, which tend to correlate worse with human judgments. Embedding-based metrics such as COMET correlate better; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with a simplified MQM guideline for error-span annotation and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, a COMET evaluation metric for African languages, by leveraging DA training data from high-resource languages and an African-centric multilingual encoder (AfroXLM-Roberta) to create the state-of-the-art evaluation metric for African-language MT with respect to Spearman-rank correlation with human judgments (+0.406).
Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages
Leong, Colin, Shandilya, Herumb, Dossou, Bonaventure F. P., Tonja, Atnafu Lambebo, Mathew, Joel, Omotayo, Abdul-Hakeem, Yousuf, Oreen, Akinjobi, Zainab, Emezue, Chris Chinenye, Muhammad, Shamsudeen, Kolawole, Steven, Choi, Younwoo, Adewumi, Tosin
Many natural language processing (NLP) tasks make use of massively pre-trained language models, which are computationally expensive. However, the need for access to high computational resources, added to the scarcity of data for African languages, constitutes a real barrier to research experiments on these languages. In this work, we explore the applicability of low-compute approaches such as language adapters in the context of this low-resource double-bind. We intend to answer the following question: do language adapters allow those who are doubly bound by data and compute to practically build useful models? Through fine-tuning experiments on African languages, we evaluate their effectiveness as cost-effective approaches to low-resource African NLP. Using solely free compute resources, our results show that language adapters achieve performance comparable to that of massive pre-trained language models, which are heavy on computational resources. This opens the door to further experimentation and exploration of the full extent of language adapters' capacities.