AITopics | avalon

Finding Friend and Foe in Multi-Agent Games

Neural Information Processing Systems | Dec-25-2025, 17:28:27 GMT

AI for multi-agent games like Go, Poker, and Dota has made great strides in recent years. Yet none of these games addresses the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge is a key game mechanism in hidden role games. Here we develop the DeepRole algorithm, a multi-agent reinforcement learning agent that we test on The Resistance: Avalon, the most popular hidden role game. DeepRole combines counterfactual regret minimization (CFR) with deep value networks trained through self-play.
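
For readers unfamiliar with CFR, the following is a minimal sketch of regret matching, the standard strategy-update rule that CFR applies at each decision point. It is not DeepRole's implementation: the abstract does not describe the game tree, the value networks, or the belief handling, so none of those appear here. The function name `regret_matching` and the example regrets are assumptions chosen for illustration.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Map cumulative regrets to a mixed strategy (illustrative helper).

    Each action is played in proportion to its positive cumulative regret.
    If no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Hypothetical regrets for three actions at one decision point.
strategy = regret_matching([1.0, 3.0, -2.0])
print(strategy)  # [0.25, 0.75, 0.0]
```

In full CFR this rule is applied at every information set and the regrets are accumulated over many self-play iterations; the average strategy, not the last one, is what converges.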

artificial intelligence, machine learning, reinforcement learning, (9 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds

Neural Information Processing Systems | Dec-24-2025, 05:18:47 GMT

Despite impressive successes, deep reinforcement learning (RL) systems still fall short of human performance on generalization to new tasks and environments that differ from their training. As a benchmark tailored for studying RL generalization, we introduce Avalon, a set of tasks in which embodied agents in highly diverse procedural 3D worlds must survive by navigating terrain, hunting or gathering food, and avoiding hazards. Avalon is unique among existing RL benchmarks in that the reward function, world dynamics, and action space are the same for every task, with tasks differentiated solely by altering the environment; its 20 tasks, ranging in complexity from eat and throw to hunt and navigate, each create worlds in which the agent must perform specific skills in order to survive. This setup enables investigations of generalization within tasks, between tasks, and to compositional tasks that require combining skills learned from previous tasks. Avalon includes a highly efficient simulator, a library of baselines, and a benchmark with scoring metrics evaluated against hundreds of hours of human performance, all of which are open-source and publicly available. We find that standard RL baselines make progress on most tasks but are still far from human performance, suggesting Avalon is challenging enough to advance the quest for generalizable RL.

artificial intelligence, machine learning, reinforcement learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Finding Friend and Foe in Multi-Agent Games

Jack Serrino, Max Kleiman-Weiner, David C. Parkes, Josh Tenenbaum

Neural Information Processing Systems | Nov-17-2025, 19:29:37 GMT

Neural Information Processing Systems http://nips.cc/

deeprole, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Texas (0.04)
North America > Canada (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

912d2b1c7b2826caf99687388d2e8f7c-AuthorFeedback.pdf

Neural Information Processing Systems | Nov-17-2025, 19:29:21 GMT

We thank all three reviewers for their comments and insightful suggestions. We outline some of these changes here. Our approach uses CFR instead of MCTS. We've added the following sentence: "Compared to [...] Does the proposed method generalize to other games such as Werewolf or Saboteur? [...] DeepRole could be applied directly to Saboteur. We mention in the discussion: "In future [...] Need ablation and analysis -- we all know trained agents are vulnerable to adversarial human players -- e.g. the [...] Another interesting observation is the bot does not need conversation.

agent, artificial intelligence, deeprole, (9 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

539f1f7dd156cfe1222b0be83f247d35-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Nov-14-2025, 07:02:11 GMT

avalon, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: Europe > Sweden > Skåne County > Malmö (0.04)

Industry:

Education (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

CSP4SDG: Constraint and Information-Theory Based Role Identification in Social Deduction Games with LLM-Enhanced Inference

Xu, Kaijie, Meng, Fandi, Verbrugge, Clark, Lucas, Simon

arXiv.org Artificial Intelligence | Nov-11-2025

In Social Deduction Games (SDGs) such as Avalon, Mafia, and Werewolf, players conceal their identities and deliberately mislead others, making hidden-role inference a central and demanding task. Accurate role identification, which forms the basis of an agent's belief state, is therefore the keystone for both human and AI performance. We introduce CSP4SDG, a probabilistic, constraint-satisfaction framework that analyses gameplay objectively. Game events and dialogue are mapped to four linguistically-agnostic constraint classes--evidence, phenomena, assertions, and hypotheses. Hard constraints prune impossible role assignments, while weighted soft constraints score the remainder; information-gain weighting links each hypothesis to its expected value under entropy reduction, and a simple closed-form scoring rule guarantees that truthful assertions converge to classical hard logic with minimum error. The resulting posterior over roles is fully interpretable and updates in real time. Experiments on three public datasets show that CSP4SDG (i) outperforms LLM-based baselines in every inference scenario, and (ii) boosts LLMs when supplied as an auxiliary "reasoning tool." Our study validates that principled probabilistic reasoning with information theory is a scalable alternative--or complement--to heavy-weight neural models for SDGs.
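
The prune-then-score idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function `role_posterior`, the four-player game, the constraints, and in particular the scoring rule (the exponential of summed soft-constraint weights), which is a generic log-linear choice and not the closed-form rule or information-gain weighting from the paper.

```python
from itertools import combinations
from math import exp
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Assignment = FrozenSet[str]  # the set of players assumed to hold the evil role
Constraint = Callable[[Assignment], bool]


def role_posterior(
    players: Sequence[str],
    n_evil: int,
    hard: List[Constraint],
    soft: List[Tuple[Constraint, float]],
) -> Dict[str, float]:
    """Per-player probability of being evil (hypothetical helper)."""
    scores: Dict[Assignment, float] = {}
    for evil in combinations(players, n_evil):
        assignment = frozenset(evil)
        # Hard constraints prune impossible assignments outright.
        if not all(check(assignment) for check in hard):
            continue
        # Soft constraints add their weight when satisfied (assumed rule).
        log_score = sum(w for check, w in soft if check(assignment))
        scores[assignment] = exp(log_score)
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("hard constraints rule out every assignment")
    return {
        p: sum(s for a, s in scores.items() if p in a) / total
        for p in players
    }


# Toy game: four players, exactly one evil.
# Hard evidence: a mission run by A and B failed, so one of them is evil.
# Soft assertion: C accuses A, trusted with weight 1.0.
posterior = role_posterior(
    players=["A", "B", "C", "D"],
    n_evil=1,
    hard=[lambda evil: bool(evil & {"A", "B"})],
    soft=[(lambda evil: "A" in evil, 1.0)],
)
print(posterior)
```

In this toy run the hard constraint removes C and D from suspicion entirely, and the accusation shifts the remaining probability mass toward A. Brute-force enumeration is only practical for small games; it is used here to keep the sketch short.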

constraint, large language model, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.06175

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Hawaii (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Finding Friend and Foe in Multi-Agent Games

Jack Serrino, Max Kleiman-Weiner, David C. Parkes, Josh Tenenbaum

Neural Information Processing Systems | Oct-3-2025, 05:41:54 GMT

Neural Information Processing Systems http://nips.cc/

deeprole, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Texas (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

912d2b1c7b2826caf99687388d2e8f7c-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-3-2025, 05:41:40 GMT

agent, artificial intelligence, deeprole, (9 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.80)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

539f1f7dd156cfe1222b0be83f247d35-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Aug-14-2025, 21:17:39 GMT

avalon, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: Europe > Sweden > Skåne County > Malmö (0.04)

Industry:

Education (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds

Neural Information Processing Systems | Oct-11-2024, 02:09:56 GMT

Despite impressive successes, deep reinforcement learning (RL) systems still fall short of human performance on generalization to new tasks and environments that differ from their training. As a benchmark tailored for studying RL generalization, we introduce Avalon, a set of tasks in which embodied agents in highly diverse procedural 3D worlds must survive by navigating terrain, hunting or gathering food, and avoiding hazards. Avalon is unique among existing RL benchmarks in that the reward function, world dynamics, and action space are the same for every task, with tasks differentiated solely by altering the environment; its 20 tasks, ranging in complexity from eat and throw to hunt and navigate, each create worlds in which the agent must perform specific skills in order to survive. This setup enables investigations of generalization within tasks, between tasks, and to compositional tasks that require combining skills learned from previous tasks. Avalon includes a highly efficient simulator, a library of baselines, and a benchmark with scoring metrics evaluated against hundreds of hours of human performance, all of which are open-source and publicly available.

avalon, machine learning, reinforcement learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)
