AITopics | villager

Collaborating Authors

villager

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multi agent Enhancing Strategic Interactions of Large Language Model in Language Game

Neural Information Processing SystemsJun-20-2026, 15:39:51 GMT

Achieving Artificial General Intelligence (AGI) requires AI agents that can not only make strategic decisions but also engage in flexible and meaningful communication. Inspired by Wittgenstein's language game theory, we propose that language agents can learn through in-context interaction rather than traditional multi-stage frameworks that separate decision-making from language expression. Using Werewolf, a social deduction game that tests language understanding, strategic interaction, and adaptability, as a test bed, we develop the Multi-agent Kahneman-Tversky's Optimization (MaKTO). MaKTO engages diverse models in extensive gameplay to generate unpaired desirable and unacceptable responses, then employs KTO to refine the model's decision-making process. In 9-player Werewolf games, MaKTO achieves a 61% average win rate across various models, outperforming GPT-4o and two-stage RL agents by relative improvements of 23.0% and 10.9%, respectively. Notably, MaKTO also demonstrates human-like performance, winning 60% against expert players and showing only 48.9% detectability in Turing-style blind tests.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

WOLF: Werewolf-based Observations for LLM Deception and Falsehoods

Agarwal, Mrinal, Rana, Saad, Sundoro, Theo, Berhe, Hermela, Kim, Spencer, Sharma, Vasu, O'Brien, Sean, Zhu, Kevin

arXiv.org Artificial IntelligenceDec-11-2025

Deception is a fundamental challenge for multi-agent reasoning: effective systems must strategically conceal information while detecting misleading behavior in others. Yet most evaluations reduce deception to static classification, ignoring the interactive, adversarial, and longitudinal nature of real deceptive dynamics. Large language models (LLMs) can deceive convincingly but remain weak at detecting deception in peers. We present WOLF, a multi-agent social deduction benchmark based on Werewolf that enables separable measurement of deception production and detection. WOLF embeds role-grounded agents (Villager, Werewolf, Seer, Doctor) in a programmable LangGraph state machine with strict night-day cycles, debate turns, and majority voting. Every statement is a distinct analysis unit, with self-assessed honesty from speakers and peer-rated deceptiveness from others. Deception is categorized via a standardized taxonomy (omission, distortion, fabrication, misdirection), while suspicion scores are longitudinally smoothed to capture both immediate judgments and evolving trust dynamics. Structured logs preserve prompts, outputs, and state transitions for full reproducibility. Across 7,320 statements and 100 runs, Werewolves produce deceptive statements in 31% of turns, while peer detection achieves 71-73% precision with ~52% overall accuracy. Precision is higher for identifying Werewolves, though false positives occur against Villagers. Suspicion toward Werewolves rises from ~52% to over 60% across rounds, while suspicion toward Villagers and the Doctor stabilizes near 44-46%. This divergence shows that extended interaction improves recall against liars without compounding errors against truthful roles. WOLF moves deception evaluation beyond static datasets, offering a dynamic, controlled testbed for measuring deceptive and detective capacity in adversarial multi-agent interaction.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.09187

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Add feedback

To unearth their past, Amazonian people turn to 'a language white men understand'

ScienceNov-6-2025, 19:00:00 GMT

The site, a few kilometers from her own hut in Ipatsé, a Kuikuro village in the Xingu Indigenous territory, was once the backyard of her great-grandparents' house. As she scrapes the brown earth with a trowel, she soon spots a black ceramic shard. It is only about the size of her palm, and this is her first day ever on an archaeological excavation. But she immediately recognizes what the object once was. "It's an alato," she says, showing the piece to a group of archaeologists and other Kuikuro who have gathered to watch the excavation in the village of Anitahagu. An alato, Yamána explains, is a large pan used to cook beiju, a white flatbread made with yucca flour that's eaten almost every day in her village. Her grandmother still has one in the backyard fire pit where she prepares most meals, just as countless Kuikuro women did before her. This alato likely belonged to her great-grandmother on her mother's side.

archaeology, heckenberger, kuikuro, (17 more...)

Science

Country:

South America > Ecuador (0.04)
South America > Brazil > São Paulo (0.04)
South America > Brazil > Santa Catarina (0.04)
(2 more...)

Genre: Research Report > New Finding (0.47)

Industry:

Education (1.00)
Government (0.94)
Energy (0.68)

Technology: Information Technology > Artificial Intelligence (0.69)

Add feedback

Scaling Laws For Scalable Oversight

Engels, Joshua, Baek, David D., Kantamneni, Subhash, Tegmark, Max

arXiv.org Artificial IntelligenceOct-28-2025

Scalable oversight, the process by which weaker AI systems supervise stronger ones, has been proposed as a key strategy to control future superintelligent systems. However, it is still unclear how scalable oversight itself scales. To address this gap, we propose a framework that quantifies the probability of successful oversight as a function of the capabilities of the overseer and the system being overseen. Specifically, our framework models oversight as a game between capability-mismatched players; the players have oversight-specific Elo scores that are a piecewise-linear function of their general intelligence, with two plateaus corresponding to task incompetence and task saturation. We validate our framework with a modified version of the game Nim and then apply it to four oversight games: Mafia, Debate, Backdoor Code and Wargames. For each game, we find scaling laws that approximate how domain performance depends on general AI system capability. We then build on our findings in a theoretical study of Nested Scalable Oversight (NSO), a process in which trusted models oversee untrusted stronger models, which then become the trusted models in the next step. We identify conditions under which NSO succeeds and derive numerically (and in some cases analytically) the optimal number of oversight levels to maximize the probability of oversight success. We also apply our theory to our four oversight games, where we find that NSO success rates at a general Elo gap of 400 are 13.5% for Mafia, 51.7% for Debate, 10.0% for Backdoor Code, and 9.4% for Wargames; these rates decline further when overseeing stronger systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.1853

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.87)

Industry:

Leisure & Entertainment > Games (1.00)
Government > Regional Government > North America Government > United States Government (0.67)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Leading the Follower: Learning Persuasive Agents in Social Deduction Games

Zheng, Zhang, Ye, Deheng, Zhao, Peilin, Wang, Hao

arXiv.org Artificial IntelligenceOct-13-2025

Large language model (LLM) agents have shown remarkable progress in social deduction games (SDGs). However, existing approaches primarily focus on information processing and strategy selection, overlooking the significance of persuasive communication in influencing other players' beliefs and responses. In SDGs, success depends not only on making correct deductions but on convincing others to response in alignment with one's intent. To address this limitation, we formalize turn-based dialogue in SDGs as a Stackelberg competition, where the current player acts as the leader who strategically influences the follower's response. Building on this theoretical foundation, we propose a reinforcement learning framework that trains agents to optimize utterances for persuasive impact. Through comprehensive experiments across three diverse SDGs, we demonstrate that our agents significantly outperform baselines. This work represents a significant step toward developing AI agents capable of strategic social influence, with implications extending to scenarios requiring persuasive communication.

large language model, machine learning, utterance, (23 more...)

arXiv.org Artificial Intelligence

2510.09087

Country:

North America > United States (0.46)
Europe > Austria (0.28)
Asia > China (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Deceive, Detect, and Disclose: Large Language Models Play Mini-Mafia

Costa, Davi Bastos, Vicente, Renato

arXiv.org Artificial IntelligenceSep-30-2025

Mafia is a social deduction game where informed mafia compete against uninformed townsfolk. Its asymmetry of information and reliance on theory-of-mind reasoning mirror real-world multi-agent scenarios, making it a useful testbed for evaluating the social intelligence of large language models (LLMs). To support a systematic study, we introduce Mini-Mafia: a simplified four-player variant with one mafioso, one detective, and two villagers. We set the mafioso to kill a villager and the detective to investigate the mafioso during the night, reducing the game to a single day phase of discussion and voting. This setup isolates three interactive capabilities through role-specific win conditions: the mafioso must deceive, the villagers must detect deception, and the detective must effectively disclose information. To measure these skills, we have LLMs play against each other, creating the Mini-Mafia Benchmark: a two-stage framework that first estimates win rates within fixed opponent configurations, then aggregates performance across them using standardized scoring. Built entirely from model interactions without external data, the benchmark evolves as new models are introduced, with each one serving both as a new opponent and as a subject of evaluation. Our experiments reveal counterintuitive results, including cases where smaller models outperform larger ones. Beyond benchmarking, Mini-Mafia enables quantitative study of emergent multi-agent dynamics such as name bias and last-speaker advantage. It also contributes to AI safety by generating training data for deception detectors and by tracking models' deception capabilities against human baselines.

background, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.23023

Genre: Research Report (1.00)

Add feedback

Ethical Considerations of Large Language Models in Game Playing

Zhang, Qingquan, Li, Yuchen, Yuan, Bo, Togelius, Julian, Yannakakis, Georgios N., Liu, Jialin

arXiv.org Artificial IntelligenceAug-25-2025

Large language models (LLMs) have demonstrated tremendous potential in game playing, while little attention has been paid to their ethical implications in those contexts. This work investigates and analyses the ethical considerations of applying LLMs in game playing, using Werewolf, also known as Mafia, as a case study. Gender bias, which affects game fairness and player experience, has been observed from the behaviour of LLMs. Some roles, such as the Guard and Werewolf, are more sensitive than others to gender information, presented as a higher degree of behavioural change. We further examine scenarios in which gender information is implicitly conveyed through names, revealing that LLMs still exhibit discriminatory tendencies even in the absence of explicit gender labels. This research showcases the importance of developing fair and ethical LLMs. Beyond our research findings, we discuss the challenges and opportunities that lie ahead in this field, emphasising the need for diving deeper into the ethical implications of LLMs in gaming and other interactive domains.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11704-025-50136-2

2508.16065

Country:

North America > United States (0.46)
Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.93)
Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards

Liu, Cheng, Lu, Yifei, Ye, Fanghua, Li, Jian, Chen, Xingyu, Ren, Feiliang, Tu, Zhaopeng, Li, Xiaolong

arXiv.org Artificial IntelligenceJul-24-2025

Role-Playing Language Agents (RPLAs) have emerged as a significant application direction for Large Language Models (LLMs). Existing approaches typically rely on prompt engineering or supervised fine-tuning to enable models to imitate character behaviors in specific scenarios, but often neglect the underlying \emph{cognitive} mechanisms driving these behaviors. Inspired by cognitive psychology, we introduce \textbf{CogDual}, a novel RPLA adopting a \textit{cognize-then-respond } reasoning paradigm. By jointly modeling external situational awareness and internal self-awareness, CogDual generates responses with improved character consistency and contextual alignment. To further optimize the performance, we employ reinforcement learning with two general-purpose reward schemes designed for open-domain text generation. Extensive experiments on the CoSER benchmark, as well as Cross-MR and LifeChoice, demonstrate that CogDual consistently outperforms existing baselines and generalizes effectively across diverse role-playing tasks.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.17147

Country: Asia > China (0.46)

Genre:

Research Report (1.00)
Personal > Interview (0.46)

Industry: Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Strategy Adaptation in Large Language Model Werewolf Agents

Nakamori, Fuya, Huang, Yin Jou, Cheng, Fei

arXiv.org Artificial IntelligenceJul-18-2025

This study proposes a method to improve the performance of Werewolf agents by switching between predefined strategies based on the attitudes of other players and the context of conversations. While prior works of Werewolf agents using prompt engineering have employed methods where effective strategies are implicitly defined, they cannot adapt to changing situations. In this research, we propose a method that explicitly selects an appropriate strategy based on the game context and the estimated roles of other players. We compare the strategy adaptation Werewolf agents with baseline agents using implicit or fixed strategies and verify the effectiveness of our proposed method.

attack strategy, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2507.12732

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

A Minecraft Movie review: It's good, actually

EngadgetApr-2-2025, 19:00:35 GMT

I too rolled my eyes when A Minecraft Movie was announced. We're all tired of seeing Jack Black in video game movies -- he was fine in Super Mario Bros., but good god Borderlands was a disaster. And the Minecraft film's trailers did it no favors, another soulless movie produced on a virtual set about a game that's completely open-ended and plotless. But it turns out A Minecraft Movie is actually good. Honestly, I'm as surprised as you are.

jack black, minecraft movie, minecraft movie review, (7 more...)

Engadget

Industry:

Media > Film (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Games > Computer Games (1.00)

Add feedback