dossou
The AI Revolution Is Crushing Thousands of Languages
Recently, Bonaventure Dossou learned of an alarming tendency in a popular AI model. The program described Fon--a language spoken by Dossou's mother and millions of others in Benin and neighboring countries--as "a fictional language." This result, which I replicated, is not unusual. Dossou is accustomed to the feeling that his culture is unseen by technology that so easily serves other people. He grew up with no Wikipedia pages in Fon, and no translation programs to help him communicate with his mother in French, in which he is more fluent.
FonMTL: Towards Multitask Learning for the Fon Language
Dossou, Bonaventure F. P., Houndayi, Iffanice, Zantou, Pamely, Hacheme, Gilles
The Fon language, spoken by an average 2 million of people, is a truly low-resourced African language, with a limited online presence, and existing datasets (just to name but a few). Multitask learning is a learning paradigm that aims to improve the generalization capacity of a model by sharing knowledge across different but related tasks: this could be prevalent in very data-scarce scenarios. In this paper, we present the first explorative approach to multitask learning, for model capabilities enhancement in Natural Language Processing for the Fon language. Specifically, we explore the tasks of Named Entity Recognition (NER) and Part of Speech Tagging (POS) for Fon. We leverage two language model heads as encoders to build shared representations for the inputs, and we use linear layers blocks for classification relative to each task. Our results on the NER and POS tasks for Fon, show competitive (or better) performances compared to several multilingual pretrained language models finetuned on single tasks. Additionally, we perform a few ablation studies to leverage the efficiency of two different loss combination strategies and find out that the equal loss weighting approach works best in our case. Our code is open-sourced at https://github.com/bonaventuredossou/multitask_fon.