Reviews: No-Press Diplomacy: Modeling Multi-Agent Gameplay

Neural Information Processing Systems

The dynamically changing alliances mean that the domain of Diplomacy presents unique challenges for agents. I agree with the authors that Diplomacy is 'deserving of special attention'; I would consider the full game to be a grand challenge for multi-agent research. With recent progress in large-scale RL focusing on single-agent and two-player zero-sum games, this problem is particularly timely. This work presents state-of-the-art agents trained with deep learning. To my knowledge, this is the first successful application of deep learning to Diplomacy.


Reviews: No-Press Diplomacy: Modeling Multi-Agent Gameplay

Neural Information Processing Systems

All reviewers agree that this paper explores interesting territory, namely multi-agent learning in the game of Diplomacy. It is a well-written and well-presented paper. The paper generated considerable discussion after the rebuttal, weighing the pros and cons of the work. The major point in favor of the work (as also indicated by the authors themselves) is that it lays groundwork for future research on Diplomacy, a game known to be very hard and challenging. The biggest concern is that the paper presents little innovation in the techniques it deploys; rather, it shows how the state of the art can be engineered to succeed in this domain to a certain extent, and illustrates the performance of known algorithms.


Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

Bakhtin, Anton, Wu, David J, Lerer, Adam, Gray, Jonathan, Jacob, Athul Paul, Farina, Gabriele, Miller, Alexander H, Brown, Noam

arXiv.org Artificial Intelligence

No-press Diplomacy is a complex strategy game involving both cooperation and competition that has served as a benchmark for multi-agent AI research. While self-play reinforcement learning has resulted in numerous successes in purely adversarial games like chess, Go, and poker, self-play alone is insufficient for achieving optimal performance in domains involving cooperation with humans. We address this shortcoming by first introducing a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy. We prove that this is a no-regret learning algorithm under a modified utility function. We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model. We used RL-DiL-piKL to train an agent we name Diplodocus. In a 200-game no-press Diplomacy tournament involving 62 human participants spanning skill levels from beginner to expert, two Diplodocus agents both achieved a higher average score than all other participants who played more than two games, and ranked first and third according to an Elo ratings model.

In two-player zero-sum (2p0s) settings, principled self-play algorithms converge to a minimax equilibrium, which in a balanced game ensures that a player will not lose in expectation regardless of the opponent's strategy (Neumann, 1928). This fact has allowed self-play, even without human data, to achieve remarkable success in 2p0s games like chess (Silver et al., 2018), Go (Silver et al., 2017), poker (Bowling et al., 2015; Brown & Sandholm, 2017), and Dota 2 (Berner et al., 2019). In principle, any finite 2p0s game can be solved via self-play given sufficient compute and memory. However, in games involving cooperation, self-play alone no longer guarantees good performance when playing with humans, even with infinite compute and memory.
This is because in complex domains there may be arbitrarily many conventions and expectations for how to cooperate, of which humans may use only a small subset (Lerer & Peysakhovich, 2019). The clearest example of this is language. A self-play agent trained from scratch without human data in a cooperative game involving free-form communication channels would almost certainly not converge to using English as the medium of communication. Obviously, such an agent would perform poorly when paired with a human English speaker. Indeed, prior work has shown that naïve extensions of self-play from scratch without human data perform poorly when playing with humans or human-like agents even in dialogue-free domains that involve cooperation rather than just competition, such as the benchmark games no-press Diplomacy (Bakhtin et al., 2021) and Hanabi (Siu et al., 2021; Cui et al., 2021).
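The regularization at the heart of DiL-piKL — pulling a reward-maximizing policy toward a human imitation policy — has a well-known closed form for a single decision: maximizing expected value minus a lambda-weighted KL penalty toward an anchor policy yields a policy proportional to anchor(a) * exp(Q(a) / lambda). A minimal sketch (the toy Q-values, anchor probabilities, and lambda settings below are illustrative, not values from the paper):

```python
import math

def kl_regularized_policy(q_values, anchor_policy, lam):
    """Closed-form solution of max_pi E_pi[Q] - lam * KL(pi || anchor):
    pi(a) is proportional to anchor(a) * exp(Q(a) / lam)."""
    weights = [p * math.exp(q / lam) for q, p in zip(q_values, anchor_policy)]
    total = sum(weights)
    return [w / total for w in weights]

# Toy example: action 1 has the higher estimated value, but the
# human anchor strongly prefers action 0.
q = [1.0, 2.0]
anchor = [0.9, 0.1]

greedy = kl_regularized_policy(q, anchor, lam=0.1)    # small lam: near reward-maximizing
humanlike = kl_regularized_policy(q, anchor, lam=10)  # large lam: stays near the anchor
```

Varying lambda interpolates between pure reward maximization (lambda near 0) and pure imitation (large lambda), which is the knob the paper's algorithms tune.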


Modeling Strong and Human-Like Gameplay with KL-Regularized Search

Jacob, Athul Paul, Wu, David J., Farina, Gabriele, Lerer, Adam, Bakhtin, Anton, Andreas, Jacob, Brown, Noam

arXiv.org Artificial Intelligence

We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior. Imitation learning is effective at predicting human actions but may not match the strength of expert humans, while self-play learning and search techniques (e.g., AlphaZero) lead to strong performance but may produce policies that are difficult for humans to understand and coordinate with. We show in chess and Go that applying Monte Carlo tree search with policies regularized based on the KL divergence from an imitation-learned policy produces policies that have higher human prediction accuracy and are stronger than the imitation policy. We then introduce a novel regret minimization algorithm that is regularized based on the KL divergence from an imitation-learned policy, and show that applying this algorithm to no-press Diplomacy yields a policy that maintains the same human prediction accuracy as imitation learning while being substantially stronger.


Human-Level Performance in No-Press Diplomacy via Equilibrium Search

Gray, Jonathan, Lerer, Adam, Bakhtin, Anton, Brown, Noam

arXiv.org Artificial Intelligence

Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings. In contrast, Diplomacy is a game of shifting alliances that involves both cooperation and competition. For this reason, Diplomacy has proven to be a formidable research challenge. In this paper we describe an agent for the no-press variant of Diplomacy that combines supervised learning on human data with one-step lookahead search via external regret minimization. External regret minimization techniques have been behind previous AI successes in adversarial games, most notably poker, but have not previously been shown to be successful in large-scale games involving cooperation. We show that our agent greatly exceeds the performance of past no-press Diplomacy bots, is unexploitable by expert humans, and achieves a rank of 23 out of 1,128 human players when playing anonymous games on a popular Diplomacy website.

A primary goal for AI research is to develop agents that can act optimally in real-world multi-agent interactions (i.e., games). However, previous large-scale game AI results have focused on either purely competitive or purely cooperative settings. In contrast, real-world games, such as business negotiations, politics, and traffic navigation, involve a far more complex mixture of cooperation and competition. In such settings, the theoretical grounding for the techniques used in previous AI breakthroughs falls apart. In this paper we augment neural policies trained through imitation learning with regret minimization search techniques, and evaluate on the benchmark game of no-press Diplomacy.
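The external regret minimization used in this agent's one-step lookahead builds on regret matching, the classic no-external-regret procedure: play each action with probability proportional to its positive cumulative regret. A minimal sketch of that core update (the numeric regrets are illustrative, not from the paper):

```python
def regret_matching(cum_regrets):
    """External-regret-minimizing strategy from cumulative regrets:
    play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in cum_regrets]
    total = sum(positive)
    if total == 0.0:
        # No action has positive regret: fall back to uniform play.
        n = len(cum_regrets)
        return [1.0 / n] * n
    return [p / total for p in positive]

strategy = regret_matching([3.0, -1.0, 1.0])  # -> [0.75, 0.0, 0.25]
```

Iterating this update against the regrets induced by all players' strategies is what drives the equilibrium-search step; the full agent layers this on top of value estimates from the learned policy network.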


No-Press Diplomacy: Modeling Multi-Agent Gameplay

Paquette, Philip, Lu, Yuchen, Bocco, Seton Steven, Smith, Max, O.-G., Satya, Kummerfeld, Jonathan K., Pineau, Joelle, Singh, Satinder, Courville, Aaron C.

Neural Information Processing Systems

Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal. Reliance on trust and coordination makes Diplomacy the first non-cooperative multi-agent benchmark for complex sequential social dilemmas in a rich environment. In this work, we focus on training an agent that learns to play the No Press version of Diplomacy where there is no dedicated communication channel between players. The model was trained on a new dataset of more than 150,000 human games. Our model is trained by supervised learning (SL) from expert trajectories, which is then used to initialize a reinforcement learning (RL) agent trained through self-play.
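The two-stage pipeline described here — supervised learning from expert trajectories, then self-play reinforcement learning initialized from the SL policy — can be sketched on a toy problem. Everything below (the frequency-count "supervised" stage, the bandit-style reward, the REINFORCE-like update) is a deliberately simplified stand-in, not the paper's actual architecture or training procedure:

```python
import random
from collections import Counter

def supervised_init(expert_actions, n_actions, smoothing=1.0):
    """Stage 1: imitate expert play via smoothed action frequencies
    (a toy stand-in for supervised learning on expert trajectories)."""
    counts = Counter(expert_actions)
    denom = len(expert_actions) + smoothing * n_actions
    return [(counts[a] + smoothing) / denom for a in range(n_actions)]

def self_play_improve(policy, reward_fn, steps=2000, lr=0.05, seed=0):
    """Stage 2: improve the SL policy with a simple policy-gradient-style
    loop (a toy stand-in for self-play RL on a one-step 'game')."""
    rng = random.Random(seed)
    probs = list(policy)
    for _ in range(steps):
        a = rng.choices(range(len(probs)), weights=probs)[0]
        r = reward_fn(a)
        # Shift probability toward rewarded actions, then renormalize.
        probs[a] = max(probs[a] + lr * r * (1.0 - probs[a]), 1e-6)
        total = sum(probs)
        probs = [p / total for p in probs]
    return probs

# Experts mostly play action 0, but the (hypothetical) reward favors action 1:
sl_policy = supervised_init([0] * 8 + [1] * 2, n_actions=2)
rl_policy = self_play_improve(sl_policy, lambda a: 1.0 if a == 1 else 0.0)
```

The point of the sketch is the initialization: the RL stage starts from the imitation policy rather than from scratch, so early self-play explores human-plausible actions while the reward signal gradually shifts probability mass toward stronger play.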