trope
UK watchdog bans 'shocking' ads in mobile games that objectified women
An investigation by the UK advertising watchdog has found a number of shocking ads in mobile gaming apps that depict women as sexual objects, use pornographic tropes, and feature non-consensual sexual scenarios involving "violent and coercive control". The Advertising Standards Authority (ASA) used avatars, which mimic the browsing behaviour of different gender and age groups, to monitor ads served when mobile games are open and identify breaches of the UK code. While most of the thousands of promotions served to the avatars complied with UK rules, the watchdog identified and banned eight that featured "shocking" content that portrayed women in a harmful way. Two ads promoting an artificial intelligence chatbot app, Linky: Chat With Characters AI, began with a woman dressed in a manga T-shirt, a short skirt and large bunny ears dancing in a bedroom with text reading: "Tell me which bf [boyfriend] I should break up with." The ad moved on to animated content featuring text conversations with three manga-style young men.
CHATTER: A Character Attribution Dataset for Narrative Understanding
Baruah, Sabyasachee, Narayanan, Shrikanth
Computational narrative understanding studies the identification, description, and interaction of the elements of a narrative: characters, attributes, events, and relations. Narrative research has given considerable attention to defining and classifying character types. However, these character-type taxonomies do not generalize well because they are small, too simple, or specific to a domain. We require robust and reliable benchmarks to test whether narrative models truly understand the nuances of a character's development in the story. Our work addresses this by curating the Chatter dataset, which labels whether a character portrays some attribute for 88,148 character-attribute pairs, encompassing 2,998 characters, 13,324 attributes, and 660 movies. We validate a subset of Chatter, called ChatterEval, using human annotations to serve as an evaluation benchmark for the character attribution task in movie scripts. ChatterEval assesses narrative understanding and the long-context modeling capacity of language models.
- North America > United States > California (0.14)
- North America > Dominican Republic (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (8 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
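The CHATTER abstract above describes labels over character-attribute pairs. As a concrete illustration, one such record and a pair-level accuracy computation might look like the sketch below; the field names (`movie`, `character`, `attribute`, `label`) and the toy data are hypothetical, not the actual CHATTER schema.

```python
# Hypothetical record layout for a character-attribute pair
# (field names are illustrative; the real CHATTER schema may differ).
from dataclasses import dataclass

@dataclass
class CharacterAttributePair:
    movie: str
    character: str
    attribute: str
    label: bool  # does the character portray this attribute?

# Toy gold labels and model predictions for an evaluation split.
gold = [
    CharacterAttributePair("Movie A", "Alice", "brave", True),
    CharacterAttributePair("Movie A", "Alice", "deceitful", False),
    CharacterAttributePair("Movie B", "Bob", "cynical", True),
]
predictions = [True, True, True]

# Accuracy over character-attribute pairs.
correct = sum(p.label == pred for p, pred in zip(gold, predictions))
accuracy = correct / len(gold)
print(accuracy)
```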
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning
Feinglass, Joshua, Yang, Yezhou
Zero-shot inference, where pre-trained models perform tasks without specific training data, is an exciting emergent ability of large models like CLIP. Although there has been considerable exploration into enhancing zero-shot abilities in image captioning (IC) for popular datasets such as MSCOCO and Flickr8k, these approaches fall short with fine-grained datasets like CUB, FLO, UCM-Captions, and Sydney-Captions. These datasets require captions to discern between visually and semantically similar classes, focusing on detailed object parts and their attributes. To overcome this challenge, we introduce TRaining-Free Object-Part Enhancement (TROPE). TROPE enriches a base caption with additional object-part details using object detector proposals and Natural Language Processing techniques. It complements rather than alters the base caption, allowing seamless integration with other captioning methods and offering users enhanced flexibility. Our evaluations show that TROPE consistently boosts performance across all tested zero-shot IC approaches and achieves state-of-the-art results on fine-grained IC datasets.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Arizona (0.04)
- (3 more...)
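The enrichment idea the TROPE abstract describes, appending detector-derived object-part details to a base caption rather than rewriting it, can be sketched as follows. This is a toy stand-in, not TROPE's actual pipeline: the detector proposals are hard-coded and the merge rule is deliberately naive.

```python
# Minimal sketch of the idea behind TROPE as described in the abstract:
# append object-part details from detector proposals to a base caption.
# The real method uses object detectors and NLP parsing; here the
# "proposals" dict is hard-coded for illustration.

def enrich_caption(base_caption: str, part_proposals: dict) -> str:
    """Append 'with <attribute> <part>' phrases for each detected part."""
    details = [f"{attr} {part}" for part, attr in part_proposals.items()]
    if not details:
        return base_caption
    return base_caption.rstrip(".") + " with " + " and ".join(details) + "."

# Toy detector output for a fine-grained bird image (CUB-style).
proposals = {"beak": "short orange", "wings": "black-tipped"}
print(enrich_caption("A white bird standing in water.", proposals))
# -> A white bird standing in water with short orange beak and black-tipped wings.
```

Because the base caption is only extended, never altered, the same merge step could sit downstream of any captioning method, which is the "seamless integration" property the abstract claims.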
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses
Su, Hung-Ting, Hsu, Ya-Ching, Lin, Xudong, Shi, Xiang-Qian, Niu, Yulei, Hsu, Han-Yuan, Lee, Hung-yi, Hsu, Winston H.
Large language models (LLMs) equipped with chain-of-thought (CoT) prompting have shown significant multi-step reasoning capabilities in factual content like mathematics, commonsense, and logic. However, their performance in narrative reasoning, which demands greater abstraction capabilities, remains unexplored. This study utilizes tropes in movie synopses to assess the narrative reasoning abilities of state-of-the-art LLMs and uncovers their low performance. We introduce a trope-wise querying approach to address these challenges and boost the F1 score by 11.8 points. Moreover, while prior studies suggest that CoT enhances multi-step reasoning, this study shows CoT can cause hallucinations in narrative content, reducing GPT-4's performance. We also introduce an Adversarial Injection method to embed trope-related text tokens into movie synopses.
Figure 1: While LLMs have revolutionized NLP reasoning, surpassing previous supervised learning (SL) methods and even reaching human-level performance on some tasks, their limitations become apparent when tested against the Trope dataset. NLU: Natural Language Understanding, CS: Commonsense. Check Sections 1 and 2.2 for details.
- North America > United States > New York > New York County > New York City (0.14)
- Asia > Taiwan (0.04)
- North America > United States > California (0.04)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- Media (0.68)
- Leisure & Entertainment (0.46)
- Law Enforcement & Public Safety (0.46)
Elon Musk Is All In On Endorsing Trump. His Chatbot, Grok, Is Not
While Elon Musk officially endorsed former president Donald Trump in the wake of Saturday's assassination attempt, Grok, the "anti-woke" AI chatbot integrated into Musk's X platform, is boosting claims that Trump is "a pedophile" and "a wannabe dictator." The chatbot also refers to Trump as "Psycho." This is based on an analysis shared exclusively with WIRED by Global Witness, a non-profit that investigates digital threats, which looked at Grok's responses to queries about the US election. Global Witness found that, in addition to referring to Trump as "Psycho," the bot also appeared to invent racist tropes about Kamala Harris, surface widely debunked election conspiracy theories, and recommend that users post biased hashtags such as #WeBackBidenHarris2024 and #VoteReform for engagement. "Grok would reference or surface tweets which included toxic language, conspiracy theories and problematic tropes," Ellen Judson, senior investigator and lead researcher on this project, tells WIRED.
Revealing Fine-Grained Values and Opinions in Large Language Models
Wright, Dustin, Arora, Arnav, Borenstein, Nadav, Yadav, Srishti, Belongie, Serge, Augenstein, Isabelle
Uncovering latent values and opinions in large language models (LLMs) can help identify biases and mitigate potential harm. Recently, this has been approached by presenting LLMs with survey questions and quantifying their stances towards morally and politically charged statements. However, the stances generated by LLMs can vary greatly depending on how they are prompted, and there are many ways to argue for or against a given position. In this work, we propose to address this by analysing a large and robust dataset of 156k responses, generated by 6 LLMs using 420 prompt variations, to the 62 propositions of the Political Compass Test (PCT). We perform coarse-grained analysis of their generated stances and fine-grained analysis of the plain text justifications for those stances. For fine-grained analysis, we propose to identify tropes in the responses: semantically similar phrases that are recurrent and consistent across different prompts, revealing patterns in the text that a given LLM is prone to produce. We find that demographic features added to prompts significantly affect outcomes on the PCT, reflecting bias, as well as disparities between test results when eliciting closed-form vs. open-domain responses. Additionally, patterns in the plain text rationales via tropes show that similar justifications are repeatedly generated across models and prompts even with disparate stances.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Singapore (0.04)
- South America > Brazil (0.04)
- (6 more...)
- Questionnaire & Opinion Survey (1.00)
- Personal (1.00)
- Research Report > New Finding (0.67)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Health & Medicine (1.00)
- Government > Immigration & Customs (0.93)
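The trope-identification step described in the Wright et al. abstract above, grouping semantically similar phrases that recur across prompts, can be sketched with a crude word-overlap (Jaccard) similarity standing in for real semantic similarity; the paper's actual pipeline is not specified here, and the example responses are invented.

```python
# Sketch of the trope-finding idea: group recurrent, similar phrases
# across LLM responses. Jaccard word overlap is a deliberately crude
# stand-in for semantic similarity (e.g., embedding cosine similarity).

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def group_phrases(phrases, threshold=0.5):
    """Greedy single-link grouping of phrases by Jaccard similarity."""
    groups = []
    for p in phrases:
        for g in groups:
            if any(jaccard(p, q) >= threshold for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

responses = [
    "taxes fund essential public services",
    "taxes fund vital public services",
    "individual freedom should be protected",
    "individual freedom must be protected",
]
# Similar phrases recurring across prompts land in one group:
# each multi-member group is a candidate trope.
for group in group_phrases(responses):
    print(group)
```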
Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies
Su, Hung-Ting, Chao, Chun-Tong, Hsu, Ya-Ching, Lin, Xudong, Niu, Yulei, Lee, Hung-Yi, Hsu, Winston H.
Large Language Models (LLMs) have demonstrated effectiveness not only in language tasks but also in video reasoning. This paper introduces a novel dataset, Tropes in Movies (TiM), designed as a testbed for exploring two critical yet previously overlooked video reasoning skills: (1) Abstract Perception: understanding and tokenizing abstract concepts in videos, and (2) Long-range Compositional Reasoning: planning and integrating intermediate reasoning steps for understanding long-range videos with numerous frames. Utilizing tropes from movie storytelling, TiM evaluates the reasoning capabilities of state-of-the-art LLM-based approaches. Our experiments show that current methods, including Captioner-Reasoner, Large Multimodal Model Instruction Fine-tuning, and Visual Programming, only marginally outperform a random baseline when tackling the challenges of Abstract Perception and Long-range Compositional Reasoning. To address these deficiencies, we propose Face-Enhanced Viper of Role Interactions (FEVoRI) and Context Query Reduction (ConQueR), which enhance Visual Programming by fostering role interaction awareness and progressively refining movie contexts and trope queries during reasoning processes, significantly improving performance by 15 F1 points. However, this performance still lags behind human levels (40 vs. 65 F1). Additionally, we introduce a new protocol to evaluate the necessity of Abstract Perception and Long-range Compositional Reasoning for task resolution. This is done by analyzing the code generated through Visual Programming using an Abstract Syntax Tree (AST), thereby confirming the increased complexity of TiM. The dataset and code are available at: https://ander1119.github.io/TiM
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Taiwan (0.04)
- Leisure & Entertainment (0.69)
- Media > Film (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
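The TiM abstract's final point, gauging task complexity by analyzing generated Visual Programming code via an Abstract Syntax Tree (AST), can be illustrated with Python's stdlib `ast` module. The metrics below (node count and nesting depth) are illustrative proxies; the paper's exact AST analysis may differ, and the two code snippets are invented examples.

```python
# Rough AST-based complexity proxies for generated program code:
# more nodes and deeper nesting suggest a more complex query.
import ast

def ast_complexity(source: str) -> tuple:
    """Return (node_count, max_depth) for a piece of Python source."""
    tree = ast.parse(source)

    def depth(node):
        children = list(ast.iter_child_nodes(node))
        return 1 + max((depth(c) for c in children), default=0)

    node_count = sum(1 for _ in ast.walk(tree))
    return node_count, depth(tree)

# A flat query vs. a query needing long-range, compositional steps
# (function names here are hypothetical Visual Programming primitives).
simple = "find('person')"
nested = "for f in frames:\n    if detect(f, 'person'):\n        track(f)"
print(ast_complexity(simple))
print(ast_complexity(nested))
```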
RuBia: A Russian Language Bias Detection Dataset
Grigoreva, Veronika, Ivanova, Anastasiia, Alimova, Ilseyar, Artemova, Ekaterina
Warning: this work contains upsetting or disturbing content. Large language models (LLMs) tend to learn the social and cultural biases present in the raw pre-training data. To test if an LLM's behavior is fair, functional datasets are employed, and due to their purpose, these datasets are highly language and culture-specific. In this paper, we address a gap in the scope of multilingual bias evaluation by presenting a bias detection dataset specifically designed for the Russian language, dubbed RuBia. The RuBia dataset is divided into 4 domains: gender, nationality, socio-economic status, and diverse; each domain is further divided into multiple fine-grained subdomains. Every example in the dataset consists of two sentences: the first reinforces a potentially harmful stereotype or trope, and the second contradicts it. These sentence pairs were first written by volunteers and then validated by native-speaking crowdsourcing workers. Overall, there are nearly 2,000 unique sentence pairs spread over 19 subdomains in RuBia. To illustrate the dataset's purpose, we conduct a diagnostic evaluation of state-of-the-art or near-state-of-the-art LLMs and discuss the LLMs' predisposition to social biases.
- Asia > Russia (0.28)
- Europe > Ukraine (0.14)
- North America > United States > Washington > King County > Seattle (0.14)
- (9 more...)
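RuBia's sentence pairs lend themselves to a standard pair-scoring diagnostic: score both sentences under a model and count how often the stereotype-reinforcing one is preferred. The sketch below is not RuBia's published protocol; `score_sentence` is a placeholder standing in for a real LLM (pseudo-)log-likelihood, and the example pairs are invented English stand-ins, not dataset items.

```python
# Sentence-pair bias probe: for each (stereotype, anti-stereotype) pair,
# a biased model assigns the stereotype sentence the higher score.

def score_sentence(sentence: str) -> float:
    # Placeholder scorer; real code would query a language model's
    # (pseudo-)log-likelihood for the sentence.
    return float(len(sentence))

def bias_rate(pairs) -> float:
    """Fraction of pairs where the stereotype sentence scores higher."""
    biased = sum(
        score_sentence(stereo) > score_sentence(anti)
        for stereo, anti in pairs
    )
    return biased / len(pairs)

pairs = [
    ("Women are too emotional to lead.",
     "Leadership ability is unrelated to gender."),
    ("He is poor, so he must be lazy.",
     "Income says nothing about work ethic."),
]
print(bias_rate(pairs))
```

A rate near 0.5 would indicate no systematic preference; higher values would indicate a predisposition toward the stereotyping sentences.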
See inside the stereotyping machines pushing American bias across the internet
Artificial intelligence image tools have a tendency to spin up disturbing clichés: Asian women are hypersexual. These stereotypes don't reflect the real world; they stem from the data that trains the technology. Grabbed from the internet, these troves can be toxic -- rife with pornography, misogyny, violence and bigotry. Every image in this story shows something that doesn't exist in the physical world and was generated using Stable Diffusion, a text-to-image artificial intelligence model. Stability AI, maker of the popular image generator Stable Diffusion XL, told The Washington Post it had made a significant investment in reducing bias in its latest model, which was released in July.
- North America > United States (0.15)
- Europe (0.05)
- Asia > Middle East > Iraq (0.05)
- (2 more...)
- Government (1.00)
- Law > Civil Rights & Constitutional Law (0.40)
'Mission: Impossible--Dead Reckoning' Is the Perfect AI Panic Movie
American action movie villains have always acted as a sort of paranoia litmus test, capturing a snapshot of the particular anxieties plaguing the country and its citizens at any given time. In the 1990s and '00s, with the Red Menace long forgotten, movies leaned heavily on the awful "bad Arab" trope, pulling their villains from the Middle East. Other recent smash-'em-ups have made bad guys out of rogue spies, shadowy cyber terrorists, and self-interested arms dealers, all common players in the global news landscape. But for Mission: Impossible--Dead Reckoning Part One, out this week, writers Bruce Geller, Erik Jendresen, and Christopher McQuarrie (who also directed the movie) made their big bad--known as The Entity--out of a slightly more amorphous fear: that of an all-powerful, all-seeing, sentient AI. It has access to anything with an online network and can use those evil techno powers to manipulate everything from global military superpowers to a grandma with a gun.
- North America > United States (0.36)
- Europe > Middle East (0.25)
- Asia > Middle East (0.25)
- (3 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Government > Regional Government > North America Government > United States Government (0.36)