Hacheme, Gilles
Distribution Shifts at Scale: Out-of-distribution Detection in Earth Observation
Ekim, Burak, Tadesse, Girmaw Abebe, Robinson, Caleb, Hacheme, Gilles, Schmitt, Michael, Dodhia, Rahul, Ferres, Juan M. Lavista
Training robust deep learning models is critical in Earth Observation, where globally deployed models often face distribution shifts that degrade performance, especially in low-data regions. Out-of-distribution (OOD) detection addresses this challenge by identifying inputs that differ from in-distribution (ID) data. However, existing methods either assume access to OOD data or compromise primary task performance, making them unsuitable for real-world deployment. We propose TARDIS, a post-hoc OOD detection method for scalable geospatial deployments. The core novelty lies in generating surrogate labels by integrating information from ID data and unknown distributions, enabling OOD detection at scale. Our method takes a pre-trained model, ID data, and WILD samples, disentangles the latter into surrogate ID and surrogate OOD labels based on internal activations, and fits a binary classifier as an OOD detector. We validate TARDIS on the EuroSAT and xBD datasets across 17 experimental setups covering covariate and semantic shifts, showing that it performs close to the theoretical upper bound in assigning surrogate ID and OOD samples in 13 cases. To demonstrate scalability, we deploy TARDIS on the Fields of the World dataset, offering actionable insights into pre-trained model behavior for large-scale deployments. The code is publicly available at https://github.com/microsoft/geospatial-ood-detection.
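The abstract describes a post-hoc pipeline: extract internal activations from a frozen pre-trained model, split unlabeled WILD samples into surrogate ID and surrogate OOD sets, and fit a binary classifier on those surrogate labels. The snippet below is only a minimal sketch of that kind of workflow, using K-means clustering and distance to the ID centroid as stand-ins for the actual surrogate-labeling step; all names are illustrative, and the real implementation is in the linked repository.

```python
# Minimal sketch of a surrogate-label OOD detection pipeline in the spirit of
# the abstract above. Function and variable names are illustrative only; see
# https://github.com/microsoft/geospatial-ood-detection for the actual TARDIS code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def fit_surrogate_ood_detector(id_feats, wild_feats, n_clusters=8):
    """id_feats / wild_feats: internal activations (N, D) from a frozen,
    pre-trained model on ID data and unlabeled WILD data, respectively."""
    # 1) Cluster the WILD activations (one simple way to partition unknown data).
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit(wild_feats)

    # 2) Score each cluster by its distance to the ID activation centroid;
    #    nearby clusters become surrogate-ID, distant ones surrogate-OOD.
    id_centroid = id_feats.mean(axis=0)
    dists = np.linalg.norm(clusters.cluster_centers_ - id_centroid, axis=1)
    ood_clusters = set(np.argsort(dists)[n_clusters // 2:])  # farthest half
    surrogate_labels = np.array(
        [1 if c in ood_clusters else 0 for c in clusters.labels_]
    )

    # 3) Fit a lightweight binary classifier on activations with surrogate labels;
    #    this becomes the post-hoc OOD detector (the primary model stays untouched).
    detector = LogisticRegression(max_iter=1000)
    detector.fit(wild_feats, surrogate_labels)
    return detector

# Usage: p_ood = detector.predict_proba(new_feats)[:, 1]
```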
FonMTL: Towards Multitask Learning for the Fon Language
Dossou, Bonaventure F. P., Houndayi, Iffanice, Zantou, Pamely, Hacheme, Gilles
The Fon language, spoken by around 2 million people, is a truly low-resourced African language, with a limited online presence and few existing datasets, to name just a couple of challenges. Multitask learning is a learning paradigm that aims to improve a model's generalization capacity by sharing knowledge across different but related tasks; this can be especially valuable in very data-scarce scenarios. In this paper, we present the first exploratory approach to multitask learning for enhancing model capabilities in Natural Language Processing for the Fon language. Specifically, we explore the tasks of Named Entity Recognition (NER) and Part-of-Speech (POS) tagging for Fon. We leverage two language model heads as encoders to build shared representations for the inputs, and we use blocks of linear layers for classification on each task. Our results on the NER and POS tasks for Fon show competitive (or better) performance compared to several multilingual pretrained language models finetuned on single tasks. Additionally, we perform ablation studies comparing the efficiency of two loss combination strategies and find that the equal loss weighting approach works best in our case. Our code is open-sourced at https://github.com/bonaventuredossou/multitask_fon.
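As a rough illustration of the architecture described above (a shared encoder producing token representations, with a linear classification head per task and equal loss weighting), here is a minimal PyTorch sketch; the encoder name, tag counts, and dimensions are assumptions rather than the authors' exact configuration.

```python
# Illustrative sketch of a shared-encoder / per-task-head multitask model for
# NER and POS tagging. Model name and dimensions are assumptions, not the
# configuration used in the paper.
import torch
import torch.nn as nn
from transformers import AutoModel

class FonMultitaskModel(nn.Module):
    def __init__(self, encoder_name="bert-base-multilingual-cased",
                 hidden=768, n_ner_tags=9, n_pos_tags=17):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)  # shared encoder
        self.ner_head = nn.Linear(hidden, n_ner_tags)           # NER-specific head
        self.pos_head = nn.Linear(hidden, n_pos_tags)           # POS-specific head

    def forward(self, input_ids, attention_mask):
        # Shared token representations feed both task heads.
        shared = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.ner_head(shared), self.pos_head(shared)

# Equal loss weighting (the strategy the abstract reports as best), sketched:
# loss = 0.5 * ce(ner_logits.view(-1, n_ner_tags), ner_labels.view(-1)) \
#      + 0.5 * ce(pos_logits.view(-1, n_pos_tags), pos_labels.view(-1))
```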
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages
Ogundepo, Odunayo, Gwadabe, Tajuddeen R., Rivera, Clara E., Clark, Jonathan H., Ruder, Sebastian, Adelani, David Ifeoluwa, Dossou, Bonaventure F. P., DIOP, Abdou Aziz, Sikasote, Claytone, Hacheme, Gilles, Buzaaba, Happy, Ezeani, Ignatius, Mabuya, Rooweither, Osei, Salomey, Emezue, Chris, Kahira, Albert Njoroge, Muhammad, Shamsuddeen H., Oladipo, Akintunde, Owodunni, Abraham Toluwase, Tonja, Atnafu Lambebo, Shode, Iyanuoluwa, Asai, Akari, Ajayi, Tunde Oluwaseyi, Siro, Clemencia, Arthur, Steven, Adeyemi, Mofetoluwa, Ahia, Orevaoghene, Aremu, Anuoluwapo, Awosan, Oyinkansola, Chukwuneke, Chiamaka, Opoku, Bernard, Ayodele, Awokoya, Otiende, Verrah, Mwase, Christine, Sinkala, Boyd, Rubungo, Andre Niyongabo, Ajisafe, Daniel A., Onwuegbuzia, Emeka Felix, Mbow, Habib, Niyomutabazi, Emile, Mukonde, Eunice, Lawan, Falalu Ibrahim, Ahmad, Ibrahim Said, Alabi, Jesujoba O., Namukombo, Martin, Chinedu, Mbonu, Phiri, Mofya, Putini, Neo, Mngoma, Ndumiso, Amuok, Priscilla A., Iro, Ruqayya Nasir, Adhiambo, Sonia
African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.
Neural Fashion Image Captioning: Accounting for Data Diversity
Hacheme, Gilles, Sayouti, Noureini
Image captioning has an increasingly broad range of applications, and fashion is no exception. Automatic item descriptions are of great interest to fashion web platforms, which sometimes host hundreds of thousands of images. This paper is one of the first to tackle image captioning for fashion images. To address dataset diversity issues, we introduce the InFashAIv1 dataset, containing almost 16,000 African fashion item images with their titles, prices, and general descriptions. We also use the well-known DeepFashion dataset in addition to InFashAIv1. Captions are generated with the Show and Tell model, made of a CNN encoder and an RNN decoder. We show that jointly training the model on both datasets improves caption quality for African-style fashion images, suggesting transfer learning from Western-style data. The InFashAIv1 dataset is released on GitHub to encourage work with more diversity inclusion.
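For readers unfamiliar with the Show and Tell setup mentioned above, here is a minimal sketch of a CNN-encoder / RNN-decoder captioner; the backbone, dimensions, and training details are illustrative assumptions, not the exact configuration used for InFashAIv1 and DeepFashion.

```python
# Minimal sketch of a Show-and-Tell style captioner (CNN encoder + RNN decoder).
# Backbone choice and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn
import torchvision.models as models

class ShowAndTell(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        backbone = models.resnet50(weights=None)  # pretrained weights would normally be loaded
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])    # image encoder -> (B, 2048, 1, 1)
        self.img_proj = nn.Linear(2048, embed_dim)                   # project image feature to embedding space
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)  # caption decoder
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.img_proj(self.cnn(images).flatten(1)).unsqueeze(1)
        tokens = self.embed(captions)
        # The image feature is fed as the first "word"; joint training on two
        # datasets simply mixes their (image, caption) pairs in each batch.
        inputs = torch.cat([feats, tokens], dim=1)
        hidden, _ = self.rnn(inputs)
        return self.out(hidden)  # per-position vocabulary logits
```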