Ige, Ifeoluwatayo A.
Mitigating Translationese in Low-resource Languages: The Storyboard Approach
Kuwanto, Garry, Urua, Eno-Abasi E., Amuok, Priscilla Amondi, Muhammad, Shamsuddeen Hassan, Aremu, Anuoluwapo, Otiende, Verrah, Nanyanga, Loice Emma, Nyoike, Teresiah W., Akpan, Aniefon D., Udouboh, Nsima Ab, Archibong, Idongesit Udeme, Moses, Idara Effiong, Ige, Ifeoluwatayo A., Ajibade, Benjamin, Awokoya, Olumide Benjamin, Abdulmumin, Idris, Aliyu, Saminu Mohammad, Iro, Ruqayya Nasir, Ahmad, Ibrahim Said, Smith, Deontae, Michaels, Praise-EL, Adelani, David Ifeoluwa, Wijaya, Derry Tanti, Andy, Anietie
Low-resource languages often face challenges in acquiring high-quality language data because of their reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach to data collection that leverages storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate that text translation is preferred in terms of accuracy, while our method yields lower accuracy but better fluency in the target language.
The African Stopwords project: curating stopwords for African languages
Emezue, Chris, Nigatu, Hellina, Thinwa, Cynthia, Zhou, Helper, Muhammad, Shamsuddeen, Louis, Lerato, Abdulmumin, Idris, Oyerinde, Samuel, Ajibade, Benjamin, Samuel, Olanrewaju, Joshua, Oviawe, Onwuegbuzia, Emeka, Emezue, Handel, Ige, Ifeoluwatayo A., Tonja, Atnafu Lambebo, Chukwuneke, Chiamaka, Dossou, Bonaventure F. P., Etori, Naome A., Emmanuel, Mbonu Chinedu, Yousuf, Oreen, Aina, Kaosarat, David, Davis
Stopwords are fundamental in Natural Language Processing (NLP) techniques for information retrieval. One of the common tasks in preprocessing text data is the removal of stopwords. Currently, while high-resource languages like English benefit from the availability of several stopword lists, low-resource languages, such as those spoken on the African continent, have none that are standardized and available for use in NLP packages. Stopwords in the context of African languages are understudied and can reveal information about the crossover between languages. The African Stopwords project aims to study and curate stopwords for African languages. When analysing text data and building various NLP models, stopwords might not add much value to the meaning of the document (Singh, 2019), depending on the NLP task (for example, text classification).
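The stopword-removal preprocessing step mentioned in the abstract above can be illustrated with a minimal sketch. The stopword set here is a tiny, hypothetical English list chosen only for illustration; it is not taken from the African Stopwords project, and the function name `remove_stopwords` is likewise an assumption rather than part of any existing package.

```python
import re

# Hypothetical, illustrative English stopword list (an assumption for this
# sketch; not a list curated by the African Stopwords project).
STOPWORDS = {"the", "is", "a", "of", "in", "and", "to"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stopwords]


if __name__ == "__main__":
    print(remove_stopwords("The removal of stopwords is a common task"))
```

A curated list for another language could be passed through the `stopwords` parameter in the same way, although tokenization on `\w+` is a simplification that may not suit every orthography.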