 villain


Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

Yi, Zihao, Jiang, Qingxuan, Ma, Ruotian, Chen, Xingyu, Yang, Qu, Wang, Mengru, Ye, Fanghua, Shen, Ying, Tu, Zhaopeng, Li, Xiaolong, Linus

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly tasked with creative generation, including the simulation of fictional characters. However, their ability to portray non-prosocial, antagonistic personas remains largely unexamined. We hypothesize that the safety alignment of modern LLMs creates a fundamental conflict with the task of authentically role-playing morally ambiguous or villainous characters. To investigate this, we introduce the Moral RolePlay benchmark, a new dataset featuring a four-level moral alignment scale and a balanced test set for rigorous evaluation. We task state-of-the-art LLMs with role-playing characters from moral paragons to pure villains. Our large-scale evaluation reveals a consistent, monotonic decline in role-playing fidelity as character morality decreases. We find that models struggle most with traits directly antithetical to safety principles, such as "Deceitful" and "Manipulative", often substituting nuanced malevolence with superficial aggression. Furthermore, we demonstrate that general chatbot proficiency is a poor predictor of villain role-playing ability, with highly safety-aligned models performing particularly poorly. Our work provides the first systematic evidence of this critical limitation, highlighting a key tension between model safety and creative fidelity. Our benchmark and findings pave the way for developing more nuanced, context-aware alignment methods.


Decoding Memes: Benchmarking Narrative Role Classification across Multilingual and Multimodal Models

Sharma, Shivam, Chakraborty, Tanmoy

arXiv.org Artificial Intelligence

This work investigates the challenging task of identifying narrative roles - Hero, Villain, Victim, and Other - in Internet memes, across three diverse test sets spanning English and code-mixed (English-Hindi) languages. Building on an annotated dataset originally skewed toward the 'Other' class, we explore a more balanced and linguistically diverse extension, originally introduced as part of the CLEF 2024 shared task. Comprehensive lexical and structural analyses highlight the nuanced, culture-specific, and context-rich language used in real memes, in contrast to synthetically curated hateful content, which exhibits explicit and repetitive lexical markers. To benchmark the role detection task, we evaluate a wide spectrum of models, including fine-tuned multilingual transformers, sentiment and abuse-aware classifiers, instruction-tuned LLMs, and multimodal vision-language models. Performance is assessed under zero-shot settings using precision, recall, and F1 metrics. We also explore prompt design strategies to guide multi-modal models and find that hybrid prompts incorporating structured instructions and role definitions offer marginal yet consistent improvements. Our findings underscore the importance of cultural grounding, prompt engineering, and multimodal reasoning in modelling subtle narrative framings in visual-textual content. Warning: This paper contains potentially harmful and offensive content.

I. Introduction

Social media platforms have become pivotal arenas for rapid information dissemination. However, this openness has also catalysed the proliferation of harmful content - including hate speech, propaganda, and misinformation, often embedded within memes [1], [2]. Memes, with their multimodal structure and cultural resonance, are particularly potent in shaping public opinion and propagating ideologies.


We Have Our First Great Summer Movie Disappointment of 2025

Slate

The taglines on M3GAN 2.0 posters read like text messages from an overconfident tween: "HEY, QUEENS." "MISS ME?" "I'M STILL THAT B." (Another that apparently exists, though I haven't seen in the wild, hilariously reads: "THIS BITCH.") Next to them, the titular robot who looks like an uncanny-valley Olsen twin peers from above circular sunglasses. This character that, per her 2023 film debut, will kill you and your little dog, too, is now being marketed with big child-star energy. While she always had more to offer than malice (her late-movie dance break went viral from its trailer alone), this moment marks a clear pivot on M3GAN's Mary Janes.


"Mountainhead" Channels the Absurdity of the Tech Bro

The New Yorker

Four tech billionaires walk into a mansion. It sounds like the setup for a punch line, but it also forms nearly the entire conceit behind "Mountainhead," a savagely entertaining but somewhat shallow new satire written and directed by Jesse Armstrong, the creator of "Succession." The film, which is streaming on HBO's Max, is a sort of chamber play, its stage a modernist castle in Utah--the Mountainhead of the title--overlooking snowy peaks. The players are a quartet of friends, or, more accurately, frenemies, who resemble a mishmash of real-world Silicon Valley founders. Steve Carell plays Randall Garrett, the group's Peter Thiel-esque mentor who, not unlike the late Steve Jobs, has cancer that his doctor tells him is incurable.


Narrative Media Framing in Political Discourse

Otmakhova, Yulia, Frermann, Lea

arXiv.org Artificial Intelligence

Narrative frames are a powerful way of conceptualizing and communicating complex, controversial ideas; however, automated frame analysis to date has mostly overlooked this framing device. In this paper, we connect elements of narrativity with fundamental aspects of framing, and present a framework which formalizes and operationalizes such aspects. We annotate and release a data set of news articles in the climate change domain, analyze the dominance of narrative frame components across political leanings, and test LLMs in their ability to predict narrative frames and their components. Finally, we apply our framework in an unsupervised way to elicit components of narrative framing in a second domain, the COVID-19 crisis, where our predictions are congruent with prior theoretical work, showing the generalizability of our approach.


The Problem With Dune: Part Two

Slate

I have questions about Denis Villeneuve's Dune: Part Two. If the Fremen have lasers, why don't they just shoot the sand harvesters and run away? Why don't they use their sandworms until the last battle? Wouldn't it make more sense to fight the other great houses on Arrakis itself, where they have sandworms, rather than board ships off-world to go off to war? If Paul (Timothée Chalamet) has to invade the galaxy at the end, why bother marrying the daughter of the emperor he just deposed?


'Mission: Impossible--Dead Reckoning' Is the Perfect AI Panic Movie

WIRED

American action movie villains have always acted as a sort of paranoia litmus test, capturing a snapshot of the particular anxieties plaguing the country and its citizens at any given time. In the 1990s and '00s, with the Red Menace long forgotten, movies leaned heavily on the awful "bad Arab" trope, pulling their villains from the Middle East. Other recent smash-'em-ups have made bad guys out of rogue spies, shadowy cyber terrorists, and self-interested arms dealers, all common players in the global news landscape. But for Mission: Impossible--Dead Reckoning Part One, out this week, writers Bruce Geller, Erik Jendresen, and Christopher McQuarrie (who also directed the movie) made their big bad--known as The Entity--out of a slightly more amorphous fear: that of an all-powerful, all-seeing, sentient AI. It has access to anything with an online network and can use those evil techno powers to manipulate everything from global military superpowers to a grandma with a gun.


How this non-gamer fell in love with 'The Legend of Zelda: Breath of the Wild'

Engadget

It was after a particularly grueling session with The Legend of Zelda: Breath of the Wild that I started to wonder: When did developers stop putting cheats into their games to help the less talented among us get through the tricky bits? When I was a kid, a little bit of Up Down Left Right A and Start together, and a little older, a little /noclip saved me no end of bother. These days, if you look for cheats for any modern game online, the best you'll get is to be sassily told to "git gud." Sorry, a little context: I play games, but I'm not a Gamer, or a Nintendo Person, so in 2023 I resolved to remedy this. So many discussions at work fly past me because while I've heard of Cliff Bleszinski and Hironobu Sakaguchi, I couldn't tell you their oeuvre without Googling.


Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?

Sharma, Shivam, Kulkarni, Atharva, Suresh, Tharun, Mathur, Himanshi, Nakov, Preslav, Akhtar, Md. Shad, Chakraborty, Tanmoy

arXiv.org Artificial Intelligence

Memes can sway people's opinions over social media as they combine visual and textual information in an easy-to-consume manner. Since memes instantly turn viral, it becomes crucial to infer their intent and potentially associated harmfulness to take timely measures as needed. A common problem associated with meme comprehension lies in detecting the entities referenced and characterizing the role of each of these entities. Here, we aim to understand whether the meme glorifies, vilifies, or victimizes each entity it refers to. To this end, we address the task of role identification of entities in harmful memes, i.e., detecting who is the 'hero', the 'villain', and the 'victim' in the meme, if any. We utilize HVVMemes - a memes dataset on US Politics and Covid-19 memes, released recently as part of the CONSTRAINT@ACL-2022 shared-task. It contains memes, entities referenced, and their associated roles: hero, villain, victim, and other. We further design VECTOR (Visual-semantic role dEteCToR), a robust multi-modal framework for the task, which integrates entity-based contextual information in the multi-modal representation and compare it to several standard unimodal (text-only or image-only) or multi-modal (image+text) models. Our experimental results show that our proposed model achieves an improvement of 4% over the best baseline and 1% over the best competing stand-alone submission from the shared-task. Besides divulging an extensive experimental setup with comparative analyses, we finally highlight the challenges encountered in addressing the complex task of semantic role labeling within memes.


9 Best Sci-Fi Movies About A.I. Taking Over (Excluding Terminator)

#artificialintelligence

Science fiction is one of the most popular genres. It's ideal for people who like to dream about what could become real one day. However, not all technological advancements are portrayed as a good thing in sci-fi movies. In multiple cases, artificial intelligence is developed, or it builds malicious intentions over time, and it decides to take over the entire world or at least part of it. It often does this even if it means hurting those who built it in the first place -- the humans.