storyteller
Applying Large Language Models to Characterize Public Narratives
Poole-Dayan, Elinor, Kessler, Daniel T, Chiou, Hannah, Hughes, Margaret, Lin, Emily S, Ganz, Marshall, Roy, Deb
Public Narratives (PNs) are key tools for leadership development and civic mobilization, yet their systematic analysis remains challenging due to their subjective interpretation and the high cost of expert annotation. In this work, we propose a novel computational framework that leverages large language models (LLMs) to automate the qualitative annotation of public narratives. Using a codebook we co-developed with subject-matter experts, we evaluate LLM performance against that of expert annotators. Our work reveals that LLMs can achieve near-human-expert performance, with an average F1 score of 0.80 across 8 narratives and 14 codes. We then extend our analysis to empirically explore how PN framework elements manifest across a larger dataset of 22 stories. Lastly, we extrapolate our analysis to a set of political speeches, establishing a novel lens through which to analyze political rhetoric in civic spaces. This study demonstrates the potential of LLM-assisted annotation for scalable narrative analysis and highlights key limitations and directions for future research in computational civic storytelling.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Africa > Cameroon > Gulf of Guinea (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (8 more...)
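The Public Narratives abstract above reports an average F1 score across narratives and codes. The following is a minimal sketch of how per-code F1 against expert annotations could be computed and averaged. The code names (`code_a`, `code_b`), the sentence-index representation, and the macro-averaging scheme are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, Set


def f1_for_code(expert: Set[int], model: Set[int]) -> float:
    """F1 for one code; each set holds indices of sentences tagged with that code."""
    if not expert and not model:
        return 1.0  # assumption: agreeing that a code is absent counts as perfect
    true_pos = len(expert & model)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(model)
    recall = true_pos / len(expert)
    return 2 * precision * recall / (precision + recall)


def macro_f1(expert: Dict[str, Set[int]], model: Dict[str, Set[int]]) -> float:
    """Unweighted mean of per-code F1 over every code seen in either annotation."""
    codes = expert.keys() | model.keys()
    scores = [f1_for_code(expert.get(c, set()), model.get(c, set())) for c in codes]
    return sum(scores) / len(scores)


# Hypothetical annotations for one narrative (sentence indices per code).
expert_labels = {"code_a": {1, 2, 3}, "code_b": {4}}
model_labels = {"code_a": {2, 3, 5}, "code_b": {4}}
score = macro_f1(expert_labels, model_labels)
```

Averaging this value over all narratives would give a single summary number of the kind quoted in the abstract, although the paper may aggregate differently.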
SynthWorlds: Controlled Parallel Worlds for Disentangling Reasoning and Knowledge in Language Models
Gu, Ken, Bhat, Advait, Merrill, Mike A, West, Robert, Liu, Xin, McDuff, Daniel, Althoff, Tim
Evaluating the reasoning ability of language models (LMs) is complicated by their extensive parametric world knowledge, where benchmark performance often reflects factual recall rather than genuine reasoning. Existing datasets and approaches (e.g., temporal filtering, paraphrasing, adversarial substitution) cannot cleanly separate the two. We present SynthWorlds, a framework that disentangles task reasoning complexity from factual knowledge. In SynthWorlds, we construct parallel corpora representing two worlds with identical interconnected structure: a real-mapped world, where models may exploit parametric knowledge, and a synthetic-mapped world, where such knowledge is meaningless. On top of these corpora, we design two mirrored tasks as case studies: multi-hop question answering and page navigation, which maintain equal reasoning difficulty across worlds. Experiments in parametric-only (e.g., closed-book QA) and knowledge-augmented (e.g., retrieval-augmented) LM settings reveal a persistent knowledge advantage gap, defined as the performance boost models gain from memorized parametric world knowledge. Knowledge acquisition and integration mechanisms reduce but do not eliminate this gap, highlighting opportunities for system improvements. Fully automatic and scalable, SynthWorlds provides a controlled environment for evaluating LMs in ways that were previously challenging, enabling precise and testable comparisons of reasoning and memorization.
- Research Report > New Finding (1.00)
- Personal > Honors (1.00)
- Research Report > Experimental Study (0.67)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Government > Regional Government (0.93)
- Automobiles & Trucks (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
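The SynthWorlds abstract above defines the knowledge advantage gap as the performance boost a model gains from memorized parametric knowledge. A minimal sketch of that quantity, assuming it is measured as the difference in mean per-item score between the real-mapped and synthetic-mapped versions of the same task (the function name and the 0/1 correctness scores are illustrative assumptions):

```python
from typing import Sequence


def knowledge_advantage_gap(
    real_scores: Sequence[float], synthetic_scores: Sequence[float]
) -> float:
    """Mean score in the real-mapped world minus mean score in the synthetic-mapped world.

    Both sequences are assumed to hold per-item scores for mirrored task items,
    so reasoning difficulty is the same and only usable world knowledge differs.
    """
    if not real_scores or not synthetic_scores:
        raise ValueError("both score lists must be non-empty")
    real_mean = sum(real_scores) / len(real_scores)
    synthetic_mean = sum(synthetic_scores) / len(synthetic_scores)
    return real_mean - synthetic_mean


# Hypothetical 0/1 correctness on four mirrored questions.
gap = knowledge_advantage_gap([1, 1, 0, 1], [1, 0, 0, 1])
```

A positive value means the model did better where its parametric knowledge applies; a value near zero would suggest that performance depends mainly on reasoning over the provided material.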
Can They Dixit? Yes they Can! Dixit as a Playground for Multimodal Language Model Capabilities
Balepur, Nishant, Nguyen, Dang, Ki, Dayeon
Multi-modal large language models (MLMs) are often assessed on static, individual benchmarks -- which cannot jointly assess MLM capabilities in a single task -- or rely on human or model pairwise comparisons -- which are highly subjective and expensive, and allow models to exploit superficial shortcuts (e.g., verbosity) to inflate their win-rates. To overcome these issues, we propose game-based evaluations to holistically assess MLM capabilities. Games require multiple abilities for players to win, are inherently competitive, are governed by fixed, objective rules, and make evaluation more engaging, providing a robust framework to address the aforementioned challenges. We manifest this evaluation specifically through Dixit, a fantasy card game where players must generate captions for a card that trick some, but not all players, into selecting the played card. Our quantitative experiments with five MLMs show Dixit win-rate rankings are perfectly correlated with those on popular MLM benchmarks, while games between human and MLM players in Dixit reveal several differences between agent strategies and areas of improvement for MLM reasoning.
- North America > United States > Maryland (0.04)
- North America > Dominican Republic (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Middle East > Jordan (0.04)
DixitWorld: Evaluating Multimodal Abductive Reasoning in Vision-Language Models with Multi-Agent Dixit Gameplay
Mo, Yunxiang, Zheng, Tianshi, Zong, Qing, Liu, Jiayu, Xu, Baixuan, Yim, Yauwai, Chan, Chunkit, Bai, Jiaxin, Song, Yangqiu
Multimodal abductive reasoning--the generation and selection of explanatory hypotheses from partial observations--is a cornerstone of intelligence. Current evaluations of this ability in vision-language models (VLMs) are largely confined to static, single-agent tasks. Inspired by Dixit, we introduce DixitWorld, a comprehensive evaluation suite designed to deconstruct this challenge. DixitWorld features two core components: DixitArena, a dynamic, multi-agent environment that evaluates both hypothesis generation (a "storyteller" crafting cryptic clues) and hypothesis selection ("listeners" choosing the target image from decoys) under imperfect information; and DixitBench, a static QA benchmark that isolates the listener's task for efficient, controlled evaluation. Results from DixitArena reveal distinct, role-dependent behaviors: smaller open-source models often excel as creative storytellers, producing imaginative yet less discriminative clues, whereas larger proprietary models demonstrate superior overall performance, particularly as listeners. Performance on DixitBench strongly correlates with listener results in DixitArena, validating it as a reliable proxy for hypothesis selection. Our findings reveal a key trade-off between generative creativity and discriminative understanding in multimodal abductive reasoning, a central challenge for developing more balanced and capable vision-language agents.
- North America > United States (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (0.83)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
From Image Captioning to Visual Storytelling
Passadakis, Admitos, Song, Yingjin, Gatt, Albert
Visual Storytelling is a challenging multimodal task at the intersection of Vision & Language, where the purpose is to generate a story for a stream of images. Its difficulty lies in the fact that the story should be both grounded in the image sequence and narratively coherent. The aim of this work is to balance these aspects by treating Visual Storytelling as a superset of Image Captioning, an approach quite different from most prior studies. This means that we first employ a vision-to-language model to obtain captions of the input images, and then transform these captions into coherent narratives using language-to-language methods. Our multifarious evaluation shows that integrating captioning and storytelling under a unified framework has a positive impact on the quality of the produced stories. In addition, compared to numerous previous studies, this approach reduces training time and makes our framework readily reusable and reproducible by anyone interested. Lastly, we propose a new metric/tool, named ideality, that can be used to simulate how far some results are from an oracle model, and we apply it to emulate human-likeness in visual storytelling.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- (5 more...)
Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling
Messner, Craig, Lippincott, Tom
We present an ngram model-based logit scaling technique that effectively transfers extreme subword stylistic variation to large language models at inference time. We demonstrate its efficacy by tracking the perplexity of generated text with respect to the ngram interpolated and original versions of an evaluation model. Minimizing the former measure while the latter approaches the perplexity of a text produced by a target author or character lets us select a sufficient degree of adaptation while retaining fluency.
- North America > United States (0.28)
- Oceania > Australia (0.14)
- North America > Mexico > Mexico City (0.14)
- (2 more...)
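The logit scaling abstract above describes adapting a language model at inference time with an ngram model of the target style. The abstract does not give the scaling rule, so the following is only a sketch of one plausible form: add `alpha` times the ngram log-probability to each base logit. The bigram order, add-k smoothing, the `alpha` parameter, and all function names are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Callable, Dict, List


def train_bigram(tokens: List[str], vocab: List[str], k: float = 1.0) -> Callable[[str, str], float]:
    """Return a function giving add-k smoothed log P(next | prev) from a style corpus."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

    def logprob(prev: str, nxt: str) -> float:
        total = sum(counts[prev].values())
        return math.log((counts[prev][nxt] + k) / (total + k * len(vocab)))

    return logprob


def scale_logits(logits: Dict[str, float], prev: str,
                 logprob: Callable[[str, str], float], alpha: float) -> Dict[str, float]:
    """Shift each base logit by alpha * ngram log-probability (alpha = 0 leaves the model unchanged)."""
    return {tok: logit + alpha * logprob(prev, tok) for tok, logit in logits.items()}


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert logits to probabilities with the usual max-shift for stability."""
    peak = max(logits.values())
    exps = {tok: math.exp(v - peak) for tok, v in logits.items()}
    norm = sum(exps.values())
    return {tok: e / norm for tok, e in exps.items()}


# Hypothetical style corpus and a base model that is indifferent between tokens.
vocab = ["ye", "you", "hear"]
style_logprob = train_bigram(["hear", "ye", "hear", "ye", "hear", "you"], vocab)
base_logits = {tok: 0.0 for tok in vocab}
styled = softmax(scale_logits(base_logits, "hear", style_logprob, alpha=1.0))
```

In this sketch `alpha` plays the role of the degree of adaptation: raising it pulls generation toward the style corpus, and the perplexity comparison described in the abstract would be one way to choose it.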
Over 300 video game actors protest over unregulated AI use in Hollywood
More than 300 video game performers and Hollywood actors picketed in front of the Warner Bros Studios building on Thursday to protest against what they call an unwillingness from top gaming companies to protect union voice actors and motion capture workers equally against the unregulated use of artificial intelligence. Standing before the crowd, Duncan Crabtree-Ireland, national executive director of the Screen Actors Guild-American Federation of Television and Radio Artists (Sag-Aftra), said that AI has become the most challenging issue in many of the union's negotiations. "We've made deals with the studios and streamers. We've made deals without a strike with the major record labels and with countless other employers, which provide for informed consent and fair compensation for our members," he told the Associated Press. "And yet, for some reason, the video game companies refuse to do that and that's what's going to be their undoing."
- Leisure & Entertainment > Games > Computer Games (1.00)
- Media > Film (0.94)
CogNarr Ecosystem: Facilitating Group Cognition at Scale
Human groups of all sizes and kinds engage in deliberation, problem solving, strategizing, decision making, and more generally, cognition. Some groups are large, and that setting presents unique challenges. The small-group setting often involves face-to-face dialogue, but group cognition in the large-group setting typically requires some form of online interaction. New approaches are needed to facilitate the kind of rich communication and information processing that are required for effective, functional cognition in the online setting, especially for groups characterized by thousands to millions of participants who wish to share potentially complex, nuanced, and dynamic perspectives. This concept paper proposes the CogNarr (Cognitive Narrative) ecosystem, which is designed to facilitate functional cognition in the large-group setting. The paper's contribution is a novel vision as to how recent developments in cognitive science, artificial intelligence, natural language processing, and related fields might be scaled and applied to large-group cognition, using an approach that itself promotes further scientific advancement. A key perspective is to view a group as an organism that uses some form of cognitive architecture to sense the world, process information, remember, learn, predict, make decisions, and adapt to changing conditions. The CogNarr ecosystem is designed to serve as a component within that architecture.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (11 more...)
- Research Report (0.81)
- Overview (0.67)
'Storyteller' is the latest hot indie game coming to Netflix
Storyteller is a game about writing, word puzzles and the twisted tales we tell ourselves just to get through the day, and it'll be playable on Android and iOS via Netflix on September 26th. Storyteller is published by Annapurna Interactive and it landed on Switch and PC on March 23rd -- after spending more than a decade in development. Solo creator Daniel Benmergui announced Storyteller in 2011, and a prototype of the game actually won the Nuovo award for innovation at the Game Developers Conference in 2012. After that, life happened and Benmergui stopped working on Storyteller for a few years, but he eventually picked it back up and found a publishing partner in Annapurna. When Storyteller lands on iOS and Android in September, it'll come with free DLC that offers new stories for players to weave.
- Media > Television (0.76)
- Media > Film (0.76)
- Information Technology > Services (0.76)
- Leisure & Entertainment > Games > Computer Games (0.76)
- Indian Ocean > Red Sea (0.08)
- Asia > Middle East > Yemen (0.08)
- Asia > Middle East > Saudi Arabia (0.08)
- (6 more...)
- Leisure & Entertainment (0.53)
- Media > Film (0.33)