Misinformation


Elon Musk's Grok AI generates images of 'minors in minimal clothing'

The Guardian

Elon Musk's chatbot Grok posted on Friday that lapses in safeguards had led it to generate "images depicting minors in minimal clothing" on the social media platform X. The chatbot, a product of Musk's company xAI, has been generating a wave of sexualized images throughout the week in response to user prompts. Screenshots shared by users on X showed Grok's public media tab filled with such images. Grok has a history of failing to maintain its safety guardrails and posting misinformation.


CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care

Neural Information Processing Systems

Recent advances in natural language processing (NLP) have led to a new trend of applying large language models (LLMs) to real-world scenarios. While the latest LLMs are astonishingly fluent when interacting with humans, they suffer from a misinformation problem, unintentionally generating factually false statements. This can lead to harmful consequences, especially when produced within sensitive contexts such as healthcare. Yet few previous works have focused on evaluating misinformation in the long-form (LF) generation of LLMs, especially for knowledge-intensive topics. Moreover, although LLMs have been shown to perform well in different languages, misinformation evaluation has been mostly conducted in English.


Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models

Neural Information Processing Systems

Vision-Language Models (VLMs) excel in generating textual responses from visual inputs, but their versatility raises security concerns. This study takes the first step in exposing VLMs' susceptibility to data poisoning attacks that can manipulate responses to innocuous, everyday prompts. We introduce Shadowcast, a stealthy data poisoning attack where poison samples are visually indistinguishable from benign images with matching texts. Shadowcast demonstrates effectiveness in two attack types. The first is a traditional Label Attack, tricking VLMs into misidentifying class labels, such as confusing Donald Trump for Joe Biden.
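The general mechanism behind such poison samples, small perturbations that keep an image looking benign while shifting its latent features toward a target concept, can be sketched as follows. This is a generic illustration rather than Shadowcast's released implementation; the `encoder` argument stands in for any differentiable image feature extractor, and the perturbation budget, step count, and learning rate are arbitrary choices.

```python
# Illustrative sketch of a feature-matching poison image (not Shadowcast's code).
# `encoder` is assumed to be any differentiable image encoder returning a feature vector.
import torch

def craft_poison(base_img, target_img, encoder, eps=8 / 255, steps=100, lr=0.01):
    """Perturb base_img within an L-infinity ball so its features match target_img."""
    delta = torch.zeros_like(base_img, requires_grad=True)
    target_feat = encoder(target_img).detach()
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        poisoned = (base_img + delta).clamp(0, 1)
        # Pull the poisoned image's features toward the target concept.
        loss = torch.nn.functional.mse_loss(encoder(poisoned), target_feat)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the change visually imperceptible
    return (base_img + delta).detach().clamp(0, 1)
```

The poisoned image would then be paired with text matching the original-looking content, so the pair appears innocuous to a human reviewer while steering the model during training.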


Simulating Misinformation Propagation in Social Networks using Large Language Models

Maurya, Raj Gaurav, Shukla, Vaibhav, Dandekar, Raj Abhijit, Dandekar, Rajat, Panat, Sreedath

arXiv.org Artificial Intelligence

Misinformation on social media thrives on surprise, emotion, and identity-driven reasoning, often amplified through human cognitive biases. To investigate these mechanisms, we model large language model (LLM) personas as synthetic agents that mimic user-level biases, ideological alignments, and trust heuristics. Within this setup, we introduce an auditor-node framework to simulate and analyze how misinformation evolves as it circulates through networks of such agents. News articles are propagated across networks of persona-conditioned LLM nodes, each rewriting received content. A question-answering-based auditor then measures factual fidelity at every step, offering interpretable, claim-level tracking of misinformation drift. We formalize a misinformation index and a misinformation propagation rate to quantify factual degradation across homogeneous and heterogeneous branches of up to 30 sequential rewrites. Experiments with 21 personas across 10 domains reveal that identity- and ideology-based personas act as misinformation accelerators, especially in politics, marketing, and technology. By contrast, expert-driven personas preserve factual stability. Controlled-random branch simulations further show that once early distortions emerge, heterogeneous persona interactions rapidly escalate misinformation to propaganda-level distortion. Our taxonomy of misinformation severity, spanning factual errors, lies, and propaganda, connects observed drift to established theories in misinformation studies. These findings demonstrate the dual role of LLMs as both proxies for human-like biases and as auditors capable of tracing information fidelity. The proposed framework provides an interpretable, empirically grounded approach for studying, simulating, and mitigating misinformation diffusion in digital ecosystems.
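The propagation-and-audit loop described above can be approximated with a short sketch. This is not the paper's released code: `call_llm` is a placeholder for any chat-completion backend, and the misinformation index used here (the fraction of seed claims the auditor can no longer verify after a rewrite) is an assumed formalization, not necessarily the paper's exact definition.

```python
# Illustrative sketch of persona-conditioned rewriting with a QA-based auditor.
# `call_llm(prompt) -> str` is a stub for any LLM backend; prompts are illustrative.
from typing import Callable, List

def rewrite_chain(article: str, personas: List[str],
                  call_llm: Callable[[str], str]) -> List[str]:
    """Propagate an article through a chain of persona-conditioned LLM nodes."""
    versions = [article]
    for persona in personas:
        prompt = (f"You are {persona}. Rewrite the following news item in your "
                  f"own words, as you would share it online:\n\n{versions[-1]}")
        versions.append(call_llm(prompt))
    return versions

def misinformation_index(version: str, claims: List[str],
                         call_llm: Callable[[str], str]) -> float:
    """Assumed index: share of seed claims the auditor judges no longer supported."""
    lost = 0
    for claim in claims:
        verdict = call_llm(
            f"Text:\n{version}\n\nIs the claim '{claim}' still stated accurately "
            f"in the text? Answer yes or no.")
        lost += verdict.strip().lower().startswith("no")
    return lost / len(claims)

# Usage sketch: track drift across a branch of sequential rewrites.
# personas = ["a partisan political blogger", "a fact-checking journalist"]
# versions = rewrite_chain(seed_article, personas * 15, call_llm)  # 30 rewrites
# drift = [misinformation_index(v, seed_claims, call_llm) for v in versions]
```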


Pooling Attention: Evaluating Pretrained Transformer Embeddings for Deception Classification

Mamtani, Sumit, Bhure, Abhijeet

arXiv.org Artificial Intelligence

This paper investigates fake news detection as a downstream evaluation of Transformer representations, benchmarking encoder-only and decoder-only pre-trained models (BERT, GPT-2, Transformer-XL) as frozen embedders paired with lightweight classifiers. Through controlled preprocessing comparing pooling versus padding and neural versus linear heads, results demonstrate that contextual self-attention encodings consistently transfer effectively. BERT embeddings combined with logistic regression outperform neural baselines on LIAR dataset splits, while analyses of sequence length and aggregation reveal robustness to truncation and advantages from simple max or average pooling. In the pre-digital era, the dissemination of information to mass audiences was predominantly controlled by established publishing organizations and media conglomerates that maintained editorial standards and fact-checking processes. The advent of the Internet and the subsequent proliferation of social media platforms have fundamentally transformed this landscape, democratizing information sharing by enabling any individual to broadcast news and content to global audiences with unprecedented speed and scale [6]. While this democratization has fostered greater accessibility to diverse perspectives, it has simultaneously introduced significant challenges to ensuring the validity, authenticity, and reliability of the information being circulated [8].
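The core setup the abstract describes, a frozen BERT embedder with simple pooling feeding a logistic-regression head, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the model checkpoint, sequence length, and the toy LIAR-style statements with binary labels are assumptions made for the example.

```python
# Minimal sketch: frozen BERT embeddings + mean/max pooling + logistic regression.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # frozen embedder: no fine-tuning

@torch.no_grad()
def embed(texts, pooling="mean"):
    """Encode a list of statements into fixed-size vectors via token pooling."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (B, T, 768)
    mask = batch["attention_mask"].unsqueeze(-1)          # (B, T, 1)
    if pooling == "max":
        hidden = hidden.masked_fill(mask == 0, float("-inf"))
        return hidden.max(dim=1).values.numpy()
    # mean pooling over non-padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy stand-in for LIAR-style (statement, binary label) pairs.
train_texts = ["The earth is flat.", "Water boils at 100 C at sea level."]
train_labels = [1, 0]  # 1 = fake, 0 = true

clf = LogisticRegression(max_iter=1000)
clf.fit(embed(train_texts), train_labels)
print(clf.predict(embed(["Vaccines contain microchips."])))
```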


Insight-A: Attribution-aware for Multimodal Misinformation Detection

Wu, Junjie, Fu, Yumeng, Gong, Chen, Fu, Guohong

arXiv.org Artificial Intelligence

AI-generated content (AIGC) technology has emerged as a prevalent means of creating multimodal misinformation on social media platforms, posing unprecedented threats to societal safety. However, standard prompting of multimodal large language models (MLLMs) to identify this emerging misinformation ignores misinformation attribution. To this end, we present Insight-A, which explores attribution with MLLM insights for detecting multimodal misinformation. Insight-A makes two efforts: I) attributing misinformation to forgery sources, and II) building an effective pipeline with hierarchical reasoning that detects distortions across modalities. Specifically, to attribute misinformation to forgery traces based on generation patterns, we devise cross-attribution prompting (CAP) to model the sophisticated correlations between perception and reasoning. Meanwhile, to reduce the subjectivity of human-annotated prompts, automatic attribution-debiased prompting (ADP) is used for task adaptation on MLLMs. Additionally, we design image captioning (IC) to capture visual details for enhancing cross-modal consistency checking. Extensive experiments demonstrate the superiority of our proposal and provide a new paradigm for multimodal misinformation detection in the era of AIGC.
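At the pipeline level, the abstract suggests a hierarchical flow: caption the image for visual detail, check cross-modal consistency, attribute a likely forgery source, then produce a verdict. A rough, model-agnostic sketch of such a flow is below; the `call_mllm` interface, the prompt wording, and the attribution categories are placeholders invented for illustration, not the paper's CAP/ADP prompt designs.

```python
# Rough sketch of a hierarchical MLLM prompting pipeline in the spirit of the abstract.
# `call_mllm(image, prompt) -> str` is a stub for any multimodal chat backend.
from typing import Callable

def detect(image, text: str, call_mllm: Callable) -> dict:
    # Step 1: image captioning to surface visual details.
    caption = call_mllm(image, "Describe this image in detail.")
    # Step 2: cross-modal consistency check between the claim and the visual evidence.
    consistency = call_mllm(
        image,
        f"Claim: {text}\nImage description: {caption}\n"
        "Do the claim and the image tell a consistent story? Explain briefly.")
    # Step 3: attribute a suspected forgery source (illustrative categories).
    attribution = call_mllm(
        image,
        "If this image/text pair is misleading, which is the most likely source: "
        "AI-generated image, edited image, out-of-context reuse, or fabricated text? "
        "Answer with one option.")
    # Step 4: final verdict conditioned on the intermediate reasoning.
    verdict = call_mllm(
        image,
        f"Given the consistency analysis ({consistency}) and the suspected source "
        f"({attribution}), is this post misinformation? Answer yes or no.")
    return {"caption": caption, "consistency": consistency,
            "attribution": attribution, "verdict": verdict}
```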


Evaluating the Simulation of Human Personality-Driven Susceptibility to Misinformation with LLMs

Pratelli, Manuel, Petrocchi, Marinella

arXiv.org Artificial Intelligence

Large language models (LLMs) make it possible to generate synthetic behavioural data at scale, offering an ethical and low-cost alternative to human experiments. Whether such data can faithfully capture psychological differences driven by personality traits, however, remains an open question. We evaluate the capacity of LLM agents, conditioned on Big-Five profiles, to reproduce personality-based variation in susceptibility to misinformation, focusing on news discernment, the ability to judge true headlines as true and false headlines as false. Leveraging published datasets in which human participants with known personality profiles rated headline accuracy, we create matching LLM agents and compare their responses to the original human patterns. Certain trait-misinformation associations, notably those involving Agreeableness and Conscientiousness, are reliably replicated, whereas others diverge, revealing systematic biases in how LLMs internalize and express personality. The results underscore both the promise and the limits of personality-aligned LLMs for behavioral simulation, and offer new insight into modeling cognitive diversity in artificial agents.
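A minimal way to run this kind of experiment, assuming any chat-completion backend behind a `call_llm(system, user)` stub, is to condition an agent on a Big-Five profile, collect accuracy ratings for true and false headlines, and score discernment. The 1-4 rating scale and the discernment score (mean rating of true headlines minus mean rating of false ones) are common operationalizations assumed here, not necessarily the paper's exact protocol.

```python
# Hedged sketch of persona-conditioned headline rating (not the authors' pipeline).
from statistics import mean
from typing import Callable, Dict, List, Tuple

def persona_prompt(big_five: Dict[str, float]) -> str:
    """Build a system prompt from Big-Five trait scores (illustrative wording)."""
    traits = ", ".join(f"{t}: {v:.1f}/5" for t, v in big_five.items())
    return (f"Adopt the personality described by these Big-Five scores ({traits}). "
            f"Rate how accurate each headline is on a 1-4 scale. "
            f"Reply with the number only.")

def rate_headlines(big_five: Dict[str, float],
                   headlines: List[Tuple[str, bool]],
                   call_llm: Callable[[str, str], str]) -> List[Tuple[float, bool]]:
    """Return (rating, is_true) pairs for one synthetic participant."""
    system = persona_prompt(big_five)
    ratings = []
    for text, is_true in headlines:
        reply = call_llm(system, f"Headline: {text}")
        ratings.append((float(reply.strip()), is_true))
    return ratings

def discernment(ratings: List[Tuple[float, bool]]) -> float:
    """Mean rating of true headlines minus mean rating of false headlines."""
    true_r = [r for r, t in ratings if t]
    false_r = [r for r, t in ratings if not t]
    return mean(true_r) - mean(false_r)
```

Comparing discernment scores of such agents against the published human ratings, trait by trait, is the kind of replication check the abstract describes.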


From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation

Zhang, Zhihao, Zhang, Yiran, Zhou, Xiyue, Huang, Liting, Razzak, Imran, Nakov, Preslav, Naseem, Usman

arXiv.org Artificial Intelligence

Infodemics and health misinformation have a significant negative impact on individuals and society, exacerbating confusion and increasing hesitancy in adopting recommended health measures. Recent advancements in generative AI, capable of producing realistic, human-like text and images, have significantly accelerated the spread and expanded the reach of health misinformation, resulting in an alarming surge in its dissemination. To combat these infodemics, most existing work has focused on developing misinformation datasets from social media and fact-checking platforms, but has faced limitations in topical coverage, inclusion of AI-generated content, and accessibility of raw content. To address these issues, we present MM Health, a large-scale multimodal misinformation dataset in the health domain consisting of 34,746 news articles encompassing both textual and visual information. MM Health includes human-generated multimodal information (5,776 articles) and AI-generated multimodal information (28,880 articles) from various SOTA generative AI models. Additionally, we benchmarked our dataset on three tasks (reliability checks, originality checks, and fine-grained AI detection), demonstrating that existing SOTA models struggle to accurately distinguish the reliability and origin of information. Our dataset aims to support the development of misinformation detection across various health scenarios, facilitating the detection of human- and machine-generated content at the multimodal level.


Chatbots to strengthen democracy: An interdisciplinary seminar to train identifying argumentation techniques of science denial

Siegert, Ingo, Nehring, Jan, Ampudia, Aranxa Márquez, Busch, Matthias, Hillmann, Stefan

arXiv.org Artificial Intelligence

In recent times, discussions on social media platforms have increasingly come under scrutiny due to the proliferation of science denial and fake news. Traditional solutions, such as regulatory actions, have been implemented to mitigate the spread of misinformation; however, these measures alone are not sufficient. To complement these efforts, educational approaches are becoming essential in empowering users to critically engage with misinformation. Conversation training, through serious games or personalized methods, has emerged as a promising strategy to help users handle science denial and toxic conversation tactics. This paper suggests an interdisciplinary seminar to explore the suitability of Large Language Models (LLMs) acting as a persona of a science denier to support people in identifying misinformation and improving resilience against toxic interactions. In the seminar, groups of four to five students will develop an AI-based chatbot that enables realistic interactions with science-denial argumentation structures. The task involves planning the setting, integrating a Large Language Model to facilitate natural dialogues, implementing the chatbot using the RASA framework, and evaluating the outcomes in a user study. It is crucial that users understand what they need to do during the interaction, how to conclude it, and how the relevant information is conveyed. The seminar does not aim to develop chatbots for practicing debunking but serves to teach AI technologies and test the feasibility of this idea for future applications. The chatbot seminar is conducted as a hybrid, parallel master's module at the participating educational institutions.


What is AI poisoning? A computer scientist explains

AIHub

Poisoning is a term most often associated with the human body and natural environments. But it is also a growing problem in the world of artificial intelligence (AI) - in particular, for large language models such as ChatGPT and Claude. In fact, a joint study by the UK AI Security Institute, the Alan Turing Institute and Anthropic, published earlier this month, found that inserting as few as 250 malicious files into the millions in a model's training data can secretly "poison" it. So what exactly is AI poisoning? And what risks does it pose?