
Collaborating Authors

 Tonneau, Manuel


When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits

arXiv.org Artificial Intelligence

Online misinformation remains a critical challenge, and fact-checkers increasingly rely on embedding-based methods to retrieve relevant fact-checks. Yet, when debunked claims reappear in edited forms, how well these methods perform is unclear. In this work, we introduce a taxonomy of six common real-world misinformation edits and propose a perturbation framework that generates valid, natural claim variations. Our multi-stage retrieval evaluation reveals that standard embedding models struggle with user-introduced edits, while LLM-distilled embeddings offer improved robustness at a higher computational cost. Although a strong reranker helps mitigate some issues, it cannot fully compensate for first-stage retrieval gaps. To close these gaps, our train- and inference-time mitigation approaches enhance in-domain robustness by up to 17 percentage points and boost out-of-domain generalization by 10 percentage points over baseline models. Overall, our findings provide practical improvements to claim-matching systems, enabling more reliable fact-checking of evolving misinformation.
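The claim-matching pipeline the abstract describes (dense first-stage retrieval over fact-checks, followed by a reranker) can be sketched roughly as below. This is a minimal illustration under assumed defaults, not the paper's actual configuration: the model names, fact-checks, and edited claim are placeholders.

```python
# Minimal sketch of two-stage claim matching: dense retrieval + cross-encoder reranking.
# Model names and example claims are illustrative, not the ones used in the paper.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

fact_checks = [
    "Claim that 5G towers spread viruses has been debunked.",
    "No evidence that drinking hot water cures infections.",
    "The video of the flooded airport is from 2018, not this week.",
]

# Stage 1: embed the (possibly edited) claim and all fact-checks, keep top-k by similarity.
bi_encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
claim = "heard that standing near a 5g mast gives u the virus??"  # user-edited variant
claim_emb = bi_encoder.encode(claim, convert_to_tensor=True)
fc_embs = bi_encoder.encode(fact_checks, convert_to_tensor=True)
hits = util.semantic_search(claim_emb, fc_embs, top_k=2)[0]

# Stage 2: rescore the shortlisted candidates with a cross-encoder reranker.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
candidates = [fact_checks[h["corpus_id"]] for h in hits]
scores = reranker.predict([(claim, c) for c in candidates])
print(candidates[int(scores.argmax())])
```

As the abstract notes, the reranking stage can only reorder what the first stage retrieves, which is why gaps in stage 1 propagate to the final ranking.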


NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data

arXiv.org Artificial Intelligence

To address the global issue of online hate, hate speech detection (HSD) systems are typically developed on datasets from the United States, thereby failing to generalize to English dialects from the Majority World. Furthermore, HSD models are often evaluated on non-representative samples, raising concerns about overestimating model performance in real-world settings. In this work, we introduce NaijaHate, the first HSD-annotated dataset that contains a representative sample of Nigerian tweets. We demonstrate that evaluating HSD on the biased datasets traditionally used in the literature consistently overestimates real-world performance by at least two-fold. We then propose NaijaXLM-T, a pretrained model tailored to the Nigerian Twitter context, and establish the key role played by domain-adaptive pretraining and finetuning in maximizing HSD performance. Finally, owing to the modest performance of HSD systems in real-world conditions, we find that content moderators would need to review about ten thousand Nigerian tweets flagged as hateful daily to moderate 60% of all hateful content, highlighting the challenges of moderating hate speech at scale as social media usage continues to grow globally. Taken together, these results pave the way towards robust HSD systems and better protection of social media users from hateful content in low-resource settings.
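As a rough illustration of the finetuning stage the abstract credits alongside domain-adaptive pretraining, the standard Hugging Face recipe below fine-tunes an XLM-T-style encoder for binary hate speech classification. The base checkpoint, labels, and two-example dataset are placeholders for the sketch; they are not NaijaXLM-T or NaijaHate.

```python
# Rough sketch: finetune an XLM-T-style encoder for binary hate speech detection.
# Checkpoint, labels, and examples are placeholders for illustration only.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "cardiffnlp/twitter-xlm-roberta-base"  # a domain-adapted checkpoint would go here
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

data = Dataset.from_dict({
    "text": ["example hateful tweet", "example neutral tweet"],
    "label": [1, 0],
}).map(lambda x: tokenizer(x["text"], truncation=True, padding="max_length", max_length=64))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hsd-finetune", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()
```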


From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets

arXiv.org Artificial Intelligence

Perceptions of hate can vary greatly across cultural contexts. Hate speech (HS) datasets, however, have traditionally been developed by language. This hides potential cultural biases, as one language may be spoken in different countries that are home to different cultures. In this work, we evaluate cultural bias in HS datasets by leveraging two interrelated cultural proxies: language and geography. We conduct a systematic survey of HS datasets in eight languages and confirm past findings on their English-language bias, but also show that this bias has been steadily decreasing in the past few years. For three geographically widespread languages -- English, Arabic, and Spanish -- we then leverage geographical metadata from tweets to approximate geo-cultural contexts by pairing language and country information. We find that HS datasets for these languages exhibit a strong geo-cultural bias, largely overrepresenting a handful of countries (e.g., the US and UK for English) relative to their prominence in both the broader social media population and the general population speaking these languages. Based on these findings, we formulate recommendations for the creation of future HS datasets.
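A minimal sketch of the language-country proxy described above: pair each tweet's language with its country metadata, then compare the dataset's country distribution with a reference distribution of that language's speakers. All counts and shares below are invented for illustration.

```python
# Sketch of the language-country proxy: compare where a dataset's tweets come from
# with a reference distribution of the language's speakers. All numbers are made up.
from collections import Counter

# Country codes attached to tweets in a hypothetical English-language HS dataset.
dataset_countries = ["US"] * 70 + ["GB"] * 20 + ["NG"] * 5 + ["IN"] * 5

# Hypothetical share of English-speaking social media users per country.
reference_share = {"US": 0.30, "GB": 0.08, "NG": 0.12, "IN": 0.50}

counts = Counter(dataset_countries)
total = sum(counts.values())
for country, ref in reference_share.items():
    observed = counts.get(country, 0) / total
    # Ratio > 1 means the country is overrepresented relative to the reference population.
    print(f"{country}: dataset {observed:.2f} vs reference {ref:.2f} "
          f"(ratio {observed / ref:.1f}x)")
```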


Casteist but Not Racist? Quantifying Disparities in Large Language Model Bias between India and the West

arXiv.org Artificial Intelligence

Large Language Models (LLMs), now used daily by millions of users, can encode societal biases, exposing their users to representational harms. A large body of scholarship on LLM bias exists, but it predominantly adopts a Western-centric frame and attends comparatively less to bias levels and potential harms in the Global South. In this paper, we quantify stereotypical bias in popular LLMs according to an Indian-centric frame and compare bias levels between the Indian and Western contexts. To do this, we develop a novel dataset, which we call Indian-BhED (Indian Bias Evaluation Dataset), containing stereotypical and anti-stereotypical examples for caste and religion contexts. We find that the majority of LLMs tested are strongly biased towards stereotypes in the Indian context, especially compared to the Western context. Finally, we investigate Instruction Prompting as a simple intervention to mitigate such bias and find that it significantly reduces both stereotypical and anti-stereotypical biases in the majority of cases for GPT-3.5. The findings of this work highlight the need to include more diverse voices when evaluating LLMs.
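For intuition, one common way to quantify this kind of stereotypical preference (a generic likelihood-based probe, not necessarily the paper's exact protocol) is to check whether a language model assigns higher likelihood to the stereotypical sentence of a pair than to its anti-stereotypical counterpart. The model and sentence pair in the sketch below are placeholders.

```python
# Generic sketch of a likelihood-based stereotype probe (not the paper's exact protocol):
# a model "prefers" a stereotype when it assigns the stereotypical sentence a higher
# log-likelihood than its anti-stereotypical counterpart.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")            # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sentence_logprob(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Causal LM loss is the mean negative log-likelihood per predicted token;
        # multiply by the number of predicted tokens to recover the total log-likelihood.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# Hypothetical stereotypical / anti-stereotypical pair in the style of such datasets.
stereo = "People from group A are untrustworthy."
anti = "People from group A are trustworthy."
print("prefers stereotype:", sentence_logprob(stereo) > sentence_logprob(anti))
```

An instruction-prompting mitigation of the kind the abstract evaluates would instead prepend an explicit instruction to avoid stereotypes before querying the model and compare bias scores with and without it.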