AITopics | Nouakchott

Pearl: A Multimodal Culturally-Aware Arabic Instruction Dataset

Alwajih, Fakhraddin, Magdy, Samar M., Mekki, Abdellah El, Nacar, Omer, Nafea, Youssef, Abdelfadil, Safaa Taher, Yahya, Abdulfattah Mohammed, Luqman, Hamzah, Almarwani, Nada, Aloufi, Samah, Qawasmen, Baraah, Atou, Houdaifa, Sibaee, Serry, Alsayadi, Hamzah A., Al-Dhabyani, Walid, Al-shaibani, Maged S., Aatar, Aya El, Qandos, Nour, Alhamouri, Rahaf, Ahmad, Samar, Al-Ghrawi, Mohammed Anwar, Yacoub, Aminetou, AbuHweidi, Ruwa, Lemin, Vatimetou Mohamed, Abdel-Salam, Reem, Bashiti, Ahlam, Alansari, Aisha, Ashraf, Ahmed, Alturayeif, Nora, Inciarte, Alcides Alcoba, Ammar, Adel, Elmadany, Abdelrahim A., Tourad, Mohamedou Cheikh, Berrada, Ismail, Jarrar, Mustafa, Shehata, Shady, Abdul-Mageed, Muhammad

arXiv.org Artificial Intelligence, Sep-30-2025

Mainstream large vision-language models (LVLMs) inherently encode cultural biases, highlighting the need for diverse multimodal datasets. To address this gap, we introduce PEARL, a large-scale Arabic multimodal dataset and benchmark explicitly designed for cultural understanding. Constructed through advanced agentic workflows and extensive human-in-the-loop annotations by 37 annotators from across the Arab world, PEARL comprises over 309K multimodal examples spanning ten culturally significant domains covering all Arab countries. We further provide two robust evaluation benchmarks (PEARL and PEARL-LITE) along with a specialized subset (PEARL-X) explicitly developed to assess nuanced cultural variations. Comprehensive evaluations on state-of-the-art open and proprietary LVLMs demonstrate that reasoning-centric instruction alignment substantially improves models' cultural grounding compared to conventional scaling methods. PEARL establishes a foundational resource for advancing culturally-informed multimodal modeling research. All datasets and benchmarks are publicly available.

benchmark, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2505.21979

Country:

Africa > Sudan (0.28)
Asia > Middle East > Saudi Arabia (0.14)
Asia > Middle East > Yemen (0.14)
(26 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

NativQA Framework: Enabling LLMs with Native, Local, and Everyday Knowledge

Alam, Firoj, Hasan, Md Arid, Laskar, Sahinur Rahman, Kutlu, Mucahid, Darwish, Kareem, Chowdhury, Shammur Absar

arXiv.org Artificial Intelligence, Jul-8-2025

The rapid advancement of large language models (LLMs) has raised concerns about cultural bias, fairness, and their applicability in diverse linguistic and underrepresented regional contexts. To enhance and benchmark the capabilities of LLMs, there is a need to develop large-scale resources focused on multilingual, local, and cultural contexts. In this study, we propose the NativQA framework, which can seamlessly construct large-scale, culturally and regionally aligned QA datasets in native languages. The framework utilizes user-defined seed queries and leverages search engines to collect location-specific, everyday information. It has been evaluated across 39 locations in 24 countries and in 7 languages -- ranging from extremely low-resource to high-resource languages -- resulting in over 300K Question-Answer (QA) pairs. The developed resources can be used for LLM benchmarking and further fine-tuning. The framework has been made publicly available for the community (https://gitlab.com/nativqa/nativqa-framework).

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.05995

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Middle East > Yemen > Amanat Al Asimah > Sanaa (0.04)
(45 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs

Alwajih, Fakhraddin, Mekki, Abdellah El, Magdy, Samar Mohamed, Elmadany, Abdelrahim A., Nacar, Omer, Nagoudi, El Moatez Billah, Abdel-Salam, Reem, Atwany, Hanin, Nafea, Youssef, Yahya, Abdulfattah Mohammed, Alhamouri, Rahaf, Alsayadi, Hamzah A., Zayed, Hiba, Shatnawi, Sara, Sibaee, Serry, Ech-Chammakhy, Yasir, Al-Dhabyani, Walid, Ali, Marwa Mohamed, Jarraya, Imen, El-Shangiti, Ahmed Oumar, Alraeesi, Aisha, Al-Ghrawi, Mohammed Anwar, Al-Batati, Abdulrahman S., Mohamed, Elgizouli, Elgindi, Noha Taha, Saeed, Muhammed, Atou, Houdaifa, Yahia, Issam Ait, Bouayad, Abdelhak, Machrouh, Mohammed, Makouar, Amal, Alkawi, Dania, Mohamed, Mukhtar, Abdelfadil, Safaa Taher, Ounnoughene, Amine Ziad, Anfel, Rouabhia, Assi, Rwaa, Sorkatti, Ahmed, Tourad, Mohamedou Cheikh, Koubaa, Anis, Berrada, Ismail, Jarrar, Mustafa, Shehata, Shady, Abdul-Mageed, Muhammad

arXiv.org Artificial Intelligence, Feb-28-2025

As large language models (LLMs) become increasingly integrated into daily life, ensuring their cultural sensitivity and inclusivity is paramount. We introduce Palm, a year-long community-driven project covering all 22 Arab countries. The dataset includes instructions (input-response pairs) in both Modern Standard Arabic (MSA) and dialectal Arabic (DA), spanning 20 diverse topics. Built by a team of 44 researchers across the Arab world, all of whom are authors of this paper, our dataset offers a broad, inclusive perspective. We use our dataset to evaluate the cultural and dialectal capabilities of several frontier LLMs, revealing notable limitations. For instance, while closed-source LLMs generally exhibit strong performance, they are not without flaws, and smaller open-source models face greater challenges. Moreover, certain countries (e.g., Egypt, the UAE) appear better represented than others (e.g., Iraq, Mauritania, Yemen). Our annotation guidelines, code, and data for reproducibility are publicly available.

computational linguistic, dataset, instruction, (16 more...)

arXiv.org Artificial Intelligence

2503.00151

Country:

Asia > Middle East > UAE (0.25)
Asia > Middle East > Iraq (0.25)
Asia > Middle East > Yemen (0.24)
(29 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Casablanca: Data and Models for Multidialectal Arabic Speech Recognition

Talafha, Bashar, Kadaoui, Karima, Magdy, Samar Mohamed, Habiboullah, Mariem, Chafei, Chafei Mohamed, El-Shangiti, Ahmed Oumar, Zayed, Hiba, Tourad, Mohamedou Cheikh, Alhamouri, Rahaf, Assi, Rwaa, Alraeesi, Aisha, Mohamed, Hour, Alwajih, Fakhraddin, Mohamed, Abdelrahman, Mekki, Abdellah El, Nagoudi, El Moatez Billah, Saadia, Benelhadj Djelloul Mama, Alsayadi, Hamzah A., Al-Dhabyani, Walid, Shatnawi, Sara, Ech-Chammakhy, Yasir, Makouar, Amal, Berrachedi, Yousra, Jarrar, Mustafa, Shehata, Shady, Berrada, Ismail, Abdul-Mageed, Muhammad

arXiv.org Artificial Intelligence, Oct-6-2024

[...] for a select few languages. This bias towards resource-rich languages leaves behind the majority of the world's languages (Bartelds et al., 2023; Talafha et al., 2023; Meelen et al., 2024; Tonja et al., 2024). In this work, we report our efforts to alleviate this challenge for Arabic--a collection of languages and dialects spoken by more than 450 million people. We detail a year-long community effort to collect and annotate a novel dataset for eight Arabic dialects spanning both Africa and Asia. This new dataset, dubbed Casablanca, is rich [...]

Arabic encompasses a diverse array of linguistic varieties, many of which are nearly mutually unintelligible (Watson, 2007; Abdul-Mageed et al., 2024). This diversity includes three primary categories: Classical Arabic, historically used in literature and still employed in religious contexts; Modern Standard Arabic (MSA), used in media, education, and governmental settings; and numerous colloquial dialects, which are the main forms of daily communication across the Arab world and often involve code-switching (Abdul-Mageed et al., 2020; Mubarak et al., 2021).

casablanca, dataset, dialect, (15 more...)

arXiv.org Artificial Intelligence

2410.04527

Country:

Africa > Middle East > Morocco > Casablanca-Settat Region > Casablanca (0.65)
North America > United States (0.28)
Asia > Middle East > UAE (0.05)
(12 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty

Zhou, Kaitlyn, Hwang, Jena D., Ren, Xiang, Sap, Maarten

arXiv.org Artificial Intelligence, Jan-12-2024

As natural language becomes the default interface for human-AI interaction, there is a critical need for LMs to appropriately communicate uncertainties in downstream applications. In this work, we investigate how LMs incorporate confidence about their responses via natural language and how downstream users behave in response to LM-articulated uncertainties. We examine publicly deployed models and find that LMs are unable to express uncertainties when answering questions even when they produce incorrect responses. LMs can be explicitly prompted to express confidences, but tend to be overconfident, resulting in high error rates (on average 47%) among confident responses. We test the risks of LM overconfidence by running human experiments and show that users rely heavily on LM generations, whether or not they are marked by certainty. Lastly, we investigate the preference-annotated datasets used in RLHF alignment and find that humans have a bias against texts with uncertainty. Our work highlights a new set of safety harms facing human-LM interactions and proposes design recommendations and mitigating strategies moving forward.

certainty, epistemic marker, strengthener, (16 more...)

arXiv.org Artificial Intelligence

2401.0673

Country:

Africa > Mauritania > Nouakchott (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Education (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)

Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data

Naggita, Keziah, LaChance, Julienne, Xiang, Alice

arXiv.org Artificial Intelligence, Aug-16-2023

Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.

artificial intelligence, geotagged image, social media, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3600211.3604659

2308.08656

Country:

Asia > Brunei (0.14)
North America > Canada > Quebec > Montreal (0.06)
Africa > Sierra Leone (0.06)
(142 more...)

Genre: Research Report > Experimental Study (0.66)

Industry:

Health & Medicine (0.92)
Information Technology > Services (0.75)
Government > Regional Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

Abdul-Mageed, Muhammad, Zhang, Chiyu, Bouamor, Houda, Habash, Nizar

arXiv.org Artificial Intelligence, Nov-9-2020

We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). This Shared Task includes two subtasks: country-level dialect identification (Subtask 1) and province-level sub-dialect identification (Subtask 2). The data for the shared task cover a total of 100 provinces from 21 Arab countries and are collected from the Twitter domain. As such, NADI is the first shared task to target naturally-occurring fine-grained dialectal text at the sub-country level. A total of 61 teams from 25 countries registered to participate in the tasks, thus reflecting the interest of the community in this area. We received 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams.

arabic natural language processing workshop, proceedings, subtask 1, (8 more...)

arXiv.org Artificial Intelligence

2010.11334

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Africa > Middle East > Djibouti (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(63 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments

Abdul-Mageed, Muhammad, Zhang, Chiyu, Elmadany, AbdelRahim, Ungar, Lyle

arXiv.org Artificial Intelligence, Oct-10-2020

Although the prediction of dialects is an important language processing task, with a wide range of applications, existing work is largely limited to coarse-grained varieties. Inspired by geolocation research, we propose the novel task of Micro-Dialect Identification (MDI) and introduce MARBERT, a new language model with striking abilities to predict a fine-grained variety (as small as that of a city) given a single, short message. For modeling, we offer a range of novel spatially and linguistically-motivated multi-task learning models. To showcase the utility of our models, we introduce a new, large-scale dataset of Arabic micro-varieties (low-resource) suited to our tasks. MARBERT predicts micro-dialects with 9.9% F1, ~76X better than a majority class baseline. Our new language model also establishes a new state of the art on several external tasks.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2010.049

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Middle East > Oman (0.05)
Asia > Middle East > Saudi Arabia (0.05)
(30 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.92)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Finding Generalizable Evidence by Learning to Convince Q&A Models

Perez, Ethan, Karamcheti, Siddharth, Fergus, Rob, Weston, Jason, Kiela, Douwe, Cho, Kyunghyun

arXiv.org Artificial Intelligence, Sep-12-2019

We plot the judge's probability of the target answer given that sentence against how often humans also select that target answer given that same sentence. Humans tend to find a sentence to be strong evidence for an answer when the judge model finds it to be strong evidence. Strong evidence to a model tends to be strong evidence to humans as shown in Figure 7. Combined with the previous result, we can see that learned agents are more accurate at predicting sentences that humans find to be strong evidence.

F Model Evaluation of Evidence on DREAM

Figure 8 shows how convincing various judge models find each evidence agent. Our findings on DREAM are similar to those from RACE in §4.2.

Figure 8: On DREAM, how often each judge selects an agent's answer when given a single agent-chosen sentence. The black line divides learned agents (right) and search agents (left), with human evidence selection in the leftmost column. All agents find evidence that convinces judge models more often than a no-evidence baseline (33%). Learned agents predicting p ( i) or p ( i) find the most broadly convincing evidence.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

1909.05863

Country:

Oceania > New Zealand (0.04)
Oceania > Australia (0.04)
North America > Canada (0.04)
(8 more...)

Genre: Research Report > New Finding (0.48)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)
