 mafia


CSP4SDG: Constraint and Information-Theory Based Role Identification in Social Deduction Games with LLM-Enhanced Inference

Xu, Kaijie, Meng, Fandi, Verbrugge, Clark, Lucas, Simon

arXiv.org Artificial Intelligence

In Social Deduction Games (SDGs) such as Avalon, Mafia, and Werewolf, players conceal their identities and deliberately mislead others, making hidden-role inference a central and demanding task. Accurate role identification, which forms the basis of an agent's belief state, is therefore the keystone for both human and AI performance. We introduce CSP4SDG, a probabilistic, constraint-satisfaction framework that analyses gameplay objectively. Game events and dialogue are mapped to four linguistically-agnostic constraint classes: evidence, phenomena, assertions, and hypotheses. Hard constraints prune impossible role assignments, while weighted soft constraints score the remainder; information-gain weighting links each hypothesis to its expected value under entropy reduction, and a simple closed-form scoring rule guarantees that truthful assertions converge to classical hard logic with minimum error. The resulting posterior over roles is fully interpretable and updates in real time. Experiments on three public datasets show that CSP4SDG (i) outperforms LLM-based baselines in every inference scenario, and (ii) boosts LLMs when supplied as an auxiliary "reasoning tool." Our study validates that principled probabilistic reasoning with information theory is a scalable alternative, or complement, to heavy-weight neural models for SDGs.
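The split between hard constraints that prune and soft constraints that score can be illustrated with a small sketch. It is not the CSP4SDG scoring rule: the function name `role_posterior`, the three-player game, and the choice to treat each soft-constraint weight as a simple likelihood factor are all assumptions made for illustration.

```python
from itertools import permutations


def role_posterior(players, roles, hard, soft):
    """Return marginal P(player = role) over assignments that pass every hard constraint.

    hard: list of predicates on an assignment dict (player -> role).
    soft: list of (weight, predicate); an assignment is multiplied by `weight`
          if the predicate holds and by `1 - weight` otherwise (an assumed rule).
    """
    scores = {}
    for perm in set(permutations(roles)):  # set() removes duplicate orderings
        assignment = dict(zip(players, perm))
        if not all(check(assignment) for check in hard):
            continue  # hard constraint violated: assignment is pruned
        score = 1.0
        for weight, check in soft:
            score *= weight if check(assignment) else 1.0 - weight
        scores[perm] = score

    total = sum(scores.values())
    if total == 0:
        raise ValueError("no assignment satisfies the hard constraints")

    posterior = {p: {} for p in players}
    for perm, score in scores.items():
        for player, role in zip(players, perm):
            posterior[player][role] = posterior[player].get(role, 0.0) + score / total
    return posterior


# Hypothetical game: one mafia among three players.
players = ["A", "B", "C"]
roles = ["mafia", "villager", "villager"]
hard = [lambda a: a["A"] != "mafia"]             # assumed evidence: A cannot be mafia
soft = [(0.7, lambda a: a["C"] == "mafia")]      # assumed assertion: "C is mafia", trusted at 0.7
post = role_posterior(players, roles, hard, soft)
print(post["C"]["mafia"], post["B"]["mafia"])
```

Exhaustive enumeration only works for very small games; it is used here so that the pruning and scoring steps stay visible.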


Scaling Laws For Scalable Oversight

Engels, Joshua, Baek, David D., Kantamneni, Subhash, Tegmark, Max

arXiv.org Artificial Intelligence

Scalable oversight, the process by which weaker AI systems supervise stronger ones, has been proposed as a key strategy to control future superintelligent systems. However, it is still unclear how scalable oversight itself scales. To address this gap, we propose a framework that quantifies the probability of successful oversight as a function of the capabilities of the overseer and the system being overseen. Specifically, our framework models oversight as a game between capability-mismatched players; the players have oversight-specific Elo scores that are a piecewise-linear function of their general intelligence, with two plateaus corresponding to task incompetence and task saturation. We validate our framework with a modified version of the game Nim and then apply it to four oversight games: Mafia, Debate, Backdoor Code and Wargames. For each game, we find scaling laws that approximate how domain performance depends on general AI system capability. We then build on our findings in a theoretical study of Nested Scalable Oversight (NSO), a process in which trusted models oversee untrusted stronger models, which then become the trusted models in the next step. We identify conditions under which NSO succeeds and derive numerically (and in some cases analytically) the optimal number of oversight levels to maximize the probability of oversight success. We also apply our theory to our four oversight games, where we find that NSO success rates at a general Elo gap of 400 are 13.5% for Mafia, 51.7% for Debate, 10.0% for Backdoor Code, and 9.4% for Wargames; these rates decline further when overseeing stronger systems.
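As a rough illustration of the Elo framing in this abstract, the sketch below combines a piecewise-linear mapping with two plateaus and the standard Elo win-probability formula. The function names, the breakpoints, and the plateau values are hypothetical, and the abstract does not confirm that the paper uses exactly this parameterization.

```python
def domain_elo(general_elo, g_low, g_high, e_low, e_high):
    """Map general Elo to task-specific Elo with two plateaus.

    Below g_low the player sits on the lower plateau (task incompetence);
    above g_high it sits on the upper plateau (task saturation);
    in between the mapping is linear. All breakpoints are assumed values.
    """
    if general_elo <= g_low:
        return e_low
    if general_elo >= g_high:
        return e_high
    fraction = (general_elo - g_low) / (g_high - g_low)
    return e_low + fraction * (e_high - e_low)


def win_probability(elo_a, elo_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


# Hypothetical overseer and overseen system, 400 general-Elo points apart.
overseer = domain_elo(1000, g_low=800, g_high=1600, e_low=200, e_high=1000)
overseen = domain_elo(1400, g_low=800, g_high=1600, e_low=200, e_high=1000)
print(win_probability(overseer, overseen))
```

Because both players fall on the linear segment here, the 400-point general gap carries over unchanged to the task-specific scores; on a plateau the same general gap would translate into a smaller task-specific gap.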


Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games

Eckhaus, Niv, Berger, Uri, Stanovsky, Gabriel

arXiv.org Artificial Intelligence

LLMs are used predominantly in synchronous communication, where a human user and a model communicate in alternating turns. In contrast, many real-world settings are asynchronous. For example, in group chats, online team meetings, or social games, there is no inherent notion of turns. In this work, we develop an adaptive asynchronous LLM agent consisting of two modules: a generator that decides what to say, and a scheduler that decides when to say it. To evaluate our agent, we collect a unique dataset of online Mafia games, where our agent plays with human participants. Overall, our agent performs on par with human players, both in game performance metrics and in its ability to blend in with the other human players. Our analysis shows that the agent's behavior in deciding when to speak closely mirrors human patterns, although differences emerge in message content. We make all of our code and data publicly available. This work paves the way for integration of LLMs into realistic human group settings, from assistance in team discussions to educational and professional environments where complex social dynamics must be navigated.


The Witcher IV, Ōkami 2 and other big reveals from the Game awards

The Guardian

Alongside some worthy winners – Balatro, Astro Bot and Metaphor: ReFantazio swept the board – the Game awards last Thursday brought a generous bounty of end-of-year announcements, like unexpected gifts under the tree. In terms of newsworthy reveals, it was the best show yet: it felt a bit like an old-school E3 conference. If you were, quite understandably, not watching a three-hour video game awards show live from LA that aired after midnight UK time, here's what's worth knowing about. We've known that another dark-fantasy RPG has been in development in Poland at CD Projekt for some time, but now we've seen it. The next Witcher game stars white-haired warrior badass Ciri, instead of her sort-of-father-figure Geralt, and the trailer shows her locked in combat with an impressively gruesome monster.


Seven things we learned from Gamescom opening night

BBC News

It has been a year with no major new console launches, and one in which the industry has seen strikes and cuts, with thousands of workers laid off. The opening night of Gamescom is often an opportunity for a big shiny night to get fans all excited for the year ahead. Setting the stage for the next 12 months, here are the biggest things we found out from Europe's biggest gaming show in Germany. In a year when games became films, and films became games, the convention centre in Cologne saw a night all about the big trailers. This year, Borderlands has taken attention for its movie adaptation starring Cate Blanchett and Kevin Hart. That film received some of the year's harshest reviews, but that has not scuppered plans for a new game in the mainline series.


ChatGPT, Tech Map, Capital Story: Unveiling the Mystery Boss

#artificialintelligence

ChatGPT, developed by OpenAI, has become the fastest-growing consumer application in history. With more than 30 executives, engineers, and researchers leaving the company to start their own companies, OpenAI has raised over US$1 billion in financing and created the "OpenAI Mafia", a powerful network of talent, social connections, and capital opportunities. This new generation of AI companies is driving a new round of technological frenzy and investment opportunities, and OpenAI is dedicated to helping humans realize their beautiful vision with an elite team. The OpenAI Mafia is the new generation of AI companies founded by OpenAI employees in the past five years, and is set to revolutionize the AI industry and shape the future of AI technology. Anthropic is an AI company founded in 2021 by Dario and Daniela Amodei, former vice presidents of OpenAI.


Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia

Ibraheem, Samee, Zhou, Gaoyue, DeNero, John

arXiv.org Artificial Intelligence

While neural networks demonstrate a remarkable ability to model linguistic content, capturing contextual information related to a speaker's conversational role is an open area of research. In this work, we analyze the effect of speaker role on language use through the game of Mafia, in which participants are assigned either an honest or a deceptive role. In addition to building a framework to collect a dataset of Mafia game records, we demonstrate that there are differences in the language produced by players with different roles. We confirm that classification models are able to rank deceptive players as more suspicious than honest ones based only on their use of language. Furthermore, we show that training models on two auxiliary tasks outperforms a standard BERT-based text classification approach. We also present methods for using our trained models to identify features that distinguish between player roles, which could be used to assist players during the Mafia game.


Tech News: The Artificial Intelligence "Mafia"

#artificialintelligence

The "Big Nine" refers to the nine biggest technology companies that play a dominant role in AI. …


1VttwfB

#artificialintelligence

It's no longer up for debate that AI is set to have a major impact on most businesses, if it isn't already, and any company that wants to stay ahead must figure out how to integrate the new technology into its structure. But how is a successful AI platform built? How the 'PayPal Mafia' redefined success in Silicon Valley: A decade ago, the PayPal Mafia played a major role in revitalizing the tech industry in Silicon Valley. The story behind this group of leaders proves that their success is more than just luck. In Mehanna's session, he explained how Facebook developed its own machine learning platform, and how Facebook employees are using it.