Iro, Ruqayya Nasir
Mitigating Translationese in Low-resource Languages: The Storyboard Approach
Kuwanto, Garry, Urua, Eno-Abasi E., Amuok, Priscilla Amondi, Muhammad, Shamsuddeen Hassan, Aremu, Anuoluwapo, Otiende, Verrah, Nanyanga, Loice Emma, Nyoike, Teresiah W., Akpan, Aniefon D., Udouboh, Nsima Ab, Archibong, Idongesit Udeme, Moses, Idara Effiong, Ige, Ifeoluwatayo A., Ajibade, Benjamin, Awokoya, Olumide Benjamin, Abdulmumin, Idris, Aliyu, Saminu Mohammad, Iro, Ruqayya Nasir, Ahmad, Ibrahim Said, Smith, Deontae, Michaels, Praise-EL, Adelani, David Ifeoluwa, Wijaya, Derry Tanti, Andy, Anietie
Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect: translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach to data collection that leverages storyboards to elicit more fluent and natural sentences. Our method presents native speakers with visual stimuli in the form of storyboards and collects their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency, using both human annotators and quantitative metrics to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method, despite lower accuracy, produces more fluent sentences in the target language.
AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages
Wang, Jiayi, Adelani, David Ifeoluwa, Agrawal, Sweta, Rei, Ricardo, Briakou, Eleftheria, Carpuat, Marine, Masiak, Marek, He, Xuanli, Bourhim, Sofia, Bukula, Andiswa, Mohamed, Muhidin, Olatoye, Temitayo, Mokayede, Hamam, Mwase, Christine, Kimotho, Wangui, Yuehgoh, Foutse, Aremu, Anuoluwapo, Ojo, Jessica, Muhammad, Shamsuddeen Hassan, Osei, Salomey, Omotayo, Abdul-Hakeem, Chukwuneke, Chiamaka, Ogayo, Perez, Hourrane, Oumaima, Anigri, Salma El, Ndolela, Lolwethu, Mangwana, Thabiso, Mohamed, Shafie Abdi, Hassan, Ayinde, Awoyomi, Oluwabusayo Olufunke, Alkhaled, Lama, Al-Azzawi, Sana, Etori, Naome A., Ochieng, Millicent, Siro, Clemencia, Njoroge, Samuel, Muchiri, Eric, Kimotho, Wangari, Momo, Lyse Naomi Wamba, Abolade, Daud, Ajao, Simbiat, Adewumi, Tosin, Shode, Iyanuoluwa, Macharm, Ricky, Iro, Ruqayya Nasir, Abdullahi, Saheed S., Moore, Stephen E., Opoku, Bernard, Akinjobi, Zainab, Afolabi, Abeeb, Obiefuna, Nnaemeka, Ogbu, Onyekachi Raphael, Brian, Sam, Otiende, Verrah Akinyi, Mbonu, Chinedu Emmanuel, Sari, Sakayo Toadoum, Stenetorp, Pontus
Despite the progress we have recorded in scaling multilingual machine translation (MT) models and evaluation data to several under-resourced African languages, it is difficult to accurately measure the progress we have made on these languages because evaluation is often performed with n-gram matching metrics such as BLEU, which often correlate poorly with human judgments. Embedding-based metrics such as COMET correlate better; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with a simplified MQM guideline for error-span annotation and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, a COMET evaluation metric for African languages, by leveraging DA training data from high-resource languages and an African-centric multilingual encoder (AfroXLM-Roberta) to create the state-of-the-art evaluation metric for African-language MT with respect to Spearman-rank correlation with human judgments (+0.406).
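The abstract above reports metric quality as Spearman-rank correlation between automatic metric scores and human judgments. As a minimal illustration of what that number measures, the sketch below computes Spearman's rho from scratch for two hypothetical score lists (the score values are invented for the example; real evaluation would use DA ratings and metric outputs over a test set):

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties.

    Converts each list to ranks, then applies the closed form
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)),
    where d_i is the difference between the two ranks of item i.
    """
    assert len(xs) == len(ys) and len(xs) > 1
    n = len(xs)

    def ranks(vals):
        # Rank 1 = smallest value; assumes no ties for simplicity.
        order = sorted(range(n), key=lambda i: vals[i])
        r = [0] * n
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical metric scores and human DA ratings for five MT outputs.
metric_scores = [0.7, 0.4, 0.9, 0.2, 0.6]
human_scores = [80, 55, 90, 30, 50]
print(spearman_rho(metric_scores, human_scores))  # 0.9
```

A rho near 1 means the metric ranks translations almost the same way humans do, which is the property the abstract's +0.406 improvement refers to; production code would typically use scipy.stats.spearmanr, which also handles ties.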
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages
Ogundepo, Odunayo, Gwadabe, Tajuddeen R., Rivera, Clara E., Clark, Jonathan H., Ruder, Sebastian, Adelani, David Ifeoluwa, Dossou, Bonaventure F. P., DIOP, Abdou Aziz, Sikasote, Claytone, Hacheme, Gilles, Buzaaba, Happy, Ezeani, Ignatius, Mabuya, Rooweither, Osei, Salomey, Emezue, Chris, Kahira, Albert Njoroge, Muhammad, Shamsuddeen H., Oladipo, Akintunde, Owodunni, Abraham Toluwase, Tonja, Atnafu Lambebo, Shode, Iyanuoluwa, Asai, Akari, Ajayi, Tunde Oluwaseyi, Siro, Clemencia, Arthur, Steven, Adeyemi, Mofetoluwa, Ahia, Orevaoghene, Aremu, Anuoluwapo, Awosan, Oyinkansola, Chukwuneke, Chiamaka, Opoku, Bernard, Ayodele, Awokoya, Otiende, Verrah, Mwase, Christine, Sinkala, Boyd, Rubungo, Andre Niyongabo, Ajisafe, Daniel A., Onwuegbuzia, Emeka Felix, Mbow, Habib, Niyomutabazi, Emile, Mukonde, Eunice, Lawan, Falalu Ibrahim, Ahmad, Ibrahim Said, Alabi, Jesujoba O., Namukombo, Martin, Chinedu, Mbonu, Phiri, Mofya, Putini, Neo, Mngoma, Ndumiso, Amuok, Priscilla A., Iro, Ruqayya Nasir, Adhiambo, Sonia
African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual retrieval is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.