A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation

Leite, João A., Arora, Arnav, Gargova, Silvia, Luz, João, Sampaio, Gustavo, Roberts, Ian, Scarton, Carolina, Bontcheva, Kalina

arXiv.org Artificial Intelligence 

While Large Language Models (LLMs) have made agentic AI, chatbots, and other intelligent applications possible, they have also enabled the affordable creation of highly convincing AI-generated disinformation (Bontcheva et al., 2024), which poses a systemic risk to democratic stability and global security (VIGINUM, 2025; Bengio, 2025). Initially, AI-generated texts suffered from linguistic mistakes and were thus more easily detectable by humans. However, modern LLMs, particularly instruction-tuned models, have improved significantly at producing outputs that are indistinguishable from human-written text (Spitale et al., 2023; Heppell et al., 2024). These advances have led to their misuse in generating persuasive disinformation narratives, including political manipulation, health disinformation, conspiracy propagation, and Foreign Information Manipulation and Interference (FIMI) (Vykopal et al., 2024; Chen and Shu, 2024a; Barman et al., 2024; Chen and Shu, 2024b; Heppell et al., 2024; VIGINUM, 2025). While there is a growing body of research on the generation and detection of LLM-produced disinformation (Chen and Shu, 2024a; Lucas et al., 2023; Vykopal et al., 2024; Heppell et al., 2024), a critical aspect remains largely unstudied: whether LLMs are capable of generating fluent and convincing personalised disinformation (i.e., disinformation narratives tailored to specific audiences) in multiple languages and at scale. The few prior studies on AI-generated personalised disinformation are limited to English and address a very narrow set of personas (e.g., students, parents) (Zugecova et al., 2024). Crucially, prior work has not yet examined whether LLMs can adapt disinformation to country-specific linguistic and cultural contexts in multiple languages.
