 extensive form game


Review for NeurIPS paper: Small Nash Equilibrium Certificates in Very Large Games

Neural Information Processing Systems

The abstraction/pseudogame idea is intuitive: some nodes of the extensive-form tree are merged into a single meta-node whose upper and lower bounds represent the optimistic and pessimistic payoffs over all possible outcomes.
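
The merging step the review describes can be sketched in a few lines; the `Node`/`MetaNode` names below are illustrative, not from the paper:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """A node of an extensive-form game tree; `payoff` is set at terminals."""
    payoff: Optional[float] = None
    children: List["Node"] = field(default_factory=list)

def leaf_payoffs(node):
    """Collect the payoffs of all terminal descendants of `node`."""
    if node.payoff is not None:
        return [node.payoff]
    out = []
    for child in node.children:
        out.extend(leaf_payoffs(child))
    return out

@dataclass
class MetaNode:
    """Merged subtree: pessimistic and optimistic bounds on its outcomes."""
    lower: float
    upper: float

def merge(node):
    """Collapse a subtree into one meta-node bounding every reachable payoff."""
    payoffs = leaf_payoffs(node)
    return MetaNode(lower=min(payoffs), upper=max(payoffs))

# A subtree with outcomes 1, -2 and 3 collapses to the bounds [-2, 3].
sub = Node(children=[Node(payoff=1.0),
                     Node(children=[Node(payoff=-2.0), Node(payoff=3.0)])])
m = merge(sub)
print(m.lower, m.upper)  # -2.0 3.0
```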


Responsibility in Extensive Form Games

Shi, Qi

arXiv.org Artificial Intelligence

Two different forms of responsibility, counterfactual and seeing-to-it, have been extensively discussed in philosophy and AI in the context of a single agent or multiple agents acting simultaneously. Although the generalisation of counterfactual responsibility to a setting where multiple agents act in some order is relatively straightforward, the same cannot be said about seeing-to-it responsibility. Two versions of seeing-to-it modality applicable to such settings have been proposed in the literature. Neither of them perfectly captures the intuition of responsibility. This paper proposes a definition of seeing-to-it responsibility for such settings that amalgamates the two modalities. The paper then shows that the newly proposed notion of responsibility and counterfactual responsibility are not definable through each other and studies the responsibility gap for these two forms of responsibility. It shows that although these two forms of responsibility are not enough to ascribe responsibility in each possible situation, this gap does not exist if higher-order responsibility is taken into account.


The strategy of conflict and cooperation

Ismail, Mehmet S.

arXiv.org Artificial Intelligence

This paper introduces a unified framework called cooperative extensive form games, which (i) generalizes standard non-cooperative games, and (ii) allows for more complex coalition formation dynamics than previous concepts like coalition-proof Nash equilibrium. Central to this framework is a novel solution concept called cooperative equilibrium system (CES). CES differs from Nash equilibrium in two important respects. First, a CES is immune to both unilateral and multilateral `credible' deviations. Second, unlike Nash equilibrium, whose stability relies on the assumption that the strategies of non-deviating players are held fixed, CES allows for the possibility that players may regroup and adjust their strategies in response to a deviation. The main result establishes that every cooperative extensive form game, possibly with imperfect information, possesses a CES. For games with perfect information, the proof is constructive. This framework is broadly applicable in contexts such as oligopolistic markets and dynamic political bargaining.


Translating Extensive Form Games to Open Games with Agency

Capucci, Matteo, Ghani, Neil, Ledent, Jérémy, Forsberg, Fredrik Nordvall

arXiv.org Artificial Intelligence

We show that open games cover extensive form games with both perfect and imperfect information. Doing so forces us to address two current weaknesses in open games: the lack of a notion of player and their agency within open games, and the lack of choice operators. Using the former we construct the latter, and these choice operators subsume previously proposed operators for open games, thereby making progress towards a core, canonical and ergonomic calculus of game operators. Collectively these innovations increase the level of compositionality of open games, and demonstrate their expressiveness.


No-harm principle, rationality, and Pareto optimality in games

Heap, Shaun Hargreaves, Ismail, Mehmet S.

arXiv.org Artificial Intelligence

Mill's classic argument for liberty requires that people's exercise of freedom should be governed by a no-harm principle (NHP). In this paper, we develop the concept of a no-harm equilibrium in $n$-person games where players maximize utility subject to the constraint of the NHP. Our main result is in the spirit of the fundamental theorems of welfare economics. We show that for every initial `reference point' in a game the associated no-harm equilibrium is Pareto efficient and, conversely, every Pareto efficient point can be supported as a no-harm equilibrium for some initial reference point.
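
One plausible reading of the NHP constraint can be made concrete by brute force on a 2x2 prisoner's-dilemma-like game; the feasibility test below is an illustrative assumption, not the paper's exact definition:

```python
import itertools
import numpy as np

# Row/column payoffs for a 2x2 prisoner's-dilemma-like game.
U1 = np.array([[3.0, 0.0], [4.0, 1.0]])
U2 = np.array([[3.0, 4.0], [0.0, 1.0]])
ref = (1.0, 1.0)   # reference point: the payoffs at mutual defection (1, 1)

def feasible(a, b):
    """One reading of the NHP: nobody ends up below the reference point."""
    return U1[a, b] >= ref[0] and U2[a, b] >= ref[1]

def is_no_harm_equilibrium(a, b):
    """Each player maximizes utility over unilateral deviations that do not
    push the opponent below the opponent's reference payoff."""
    if not feasible(a, b):
        return False
    for a2 in range(2):   # row player's deviations
        if U2[a2, b] >= ref[1] and U1[a2, b] > U1[a, b]:
            return False
    for b2 in range(2):   # column player's deviations
        if U1[a, b2] >= ref[0] and U2[a, b2] > U2[a, b]:
            return False
    return True

eqs = [(a, b) for a, b in itertools.product(range(2), range(2))
       if is_no_harm_equilibrium(a, b)]
print(eqs)
```

Under this reading, mutual cooperation (0, 0), which is Pareto efficient, is sustained as an equilibrium: a unilateral defection would push the opponent below the reference point and is therefore ruled out.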


Fast Convergence of Optimistic Gradient Ascent in Network Zero-Sum Extensive Form Games

Piliouras, Georgios, Ratliff, Lillian, Sim, Ryann, Skoulakis, Stratis

arXiv.org Artificial Intelligence

The study of learning in games has thus far focused primarily on normal form games. In contrast, our understanding of learning in extensive form games (EFGs) and particularly in EFGs with many agents lags far behind, despite them being closer in nature to many real-world applications. We consider the natural class of Network Zero-Sum Extensive Form Games, which combines the global zero-sum property of agent payoffs, the efficient representation of graphical games as well as the expressive power of EFGs. We examine the convergence properties of Optimistic Gradient Ascent (OGA) in these games. We prove that the time-average behavior of such online learning dynamics exhibits $O(1/T)$ rate convergence to the set of Nash Equilibria. Moreover, we show that the day-to-day behavior also converges to Nash with rate $O(c^{-t})$ for some game-dependent constant $c>0$.
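
A minimal sketch of the optimistic update on a scalar bilinear zero-sum game (a toy illustration, not a network EFG from the paper) shows the day-to-day convergence the abstract describes:

```python
# Optimistic gradient dynamics on the scalar bilinear zero-sum game
# f(x, y) = x * y, where x minimizes and y maximizes.
eta = 0.1
x, y = 1.0, 1.0      # current iterates
xp, yp = 1.0, 1.0    # previous iterates, used for the optimistic correction

for _ in range(2000):
    gx = 2.0 * y - yp          # optimistic gradient in x (grad_x f = y)
    gy = 2.0 * x - xp          # optimistic gradient in y (grad_y f = x)
    xp, yp = x, y
    x, y = x - eta * gx, y + eta * gy

print(abs(x) + abs(y))   # the unique equilibrium of f is (0, 0)
```

Plain simultaneous gradient descent/ascent on this game spirals away from the equilibrium; the extra `- yp` / `- xp` lookahead terms are what make the day-to-day iterates contract.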


Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

Marris, Luke, Muller, Paul, Lanctot, Marc, Tuyls, Karl, Graepel, Thore

arXiv.org Artificial Intelligence

Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. Outside of normal form (NF) games, this problem setting arises in multi-agent training when dealing with empirical games (also called meta-games), where a game payoff tensor is populated with expected outcomes between agents playing an extensive form (EF) game, for example the StarCraft League (Vinyals et al., 2019) and Policy-Space Response Oracles (PSRO) (Lanctot et al., 2017), a recent variant of which reached state-of-the-art results in Barrage Stratego (McAleer et al., 2020). We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers.
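
The empirical-game construction described above can be sketched as follows; `simulate` is a hypothetical stand-in for playing out the underlying EF game:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(pi_row, pi_col, episodes=100):
    """Hypothetical stand-in for playing out the underlying EF game between
    two fixed policies and averaging the row player's noisy returns."""
    return float(np.mean([pi_row - pi_col + rng.normal(0.0, 0.1)
                          for _ in range(episodes)]))

# A toy "population" of three policies, each summarized by a single number.
policies = [0.0, 0.5, 1.0]

# Empirical payoff tensor: entry [i, j] is the expected outcome of policy i
# (rows) against policy j (columns); a meta-solver is then run on this table.
meta = np.array([[simulate(a, b) for b in policies] for a in policies])
print(meta.round(2))
```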


Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

Ling, Chun Kai, Fang, Fei, Kolter, J. Zico

arXiv.org Artificial Intelligence

With the recent advances in solving large, zero-sum extensive form games, there is a growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions. Although recent work provides a powerful differentiable end-to-end learning framework that embeds a game solver within a deep network, allowing unknown game parameters to be learned via backpropagation, this framework faces significant limitations when applied to boundedly rational human agents and large-scale problems, limiting its practicality. In this paper, we address these limitations and propose a framework that is applicable to more practical settings. First, seeking to learn the rationality of human agents in complex two-player zero-sum games, we draw upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, and derive solvers and gradients for end-to-end learning. Second, to scale up to large, real-world scenarios, we propose an efficient first-order primal-dual method which exploits the structure of extensive-form games, yielding significantly faster computation for both game solving and gradient computation. When tested on randomly generated games, we report speedups of orders of magnitude over previous approaches. We also demonstrate the effectiveness of our model on both real-world one-player settings and synthetic data.


Revisiting CFR+ and Alternating Updates

Burch, Neil, Moravcik, Matej, Schmid, Martin

Journal of Artificial Intelligence Research

The CFR+ algorithm for solving imperfect information games is a variant of the popular CFR algorithm, with faster empirical performance on a range of problems. It was introduced with a theoretical upper bound on solution error, but subsequent work showed an error in one step of the proof. We provide updated proofs to recover the original bound.
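
The regret-matching+ update at the core of CFR+, together with alternating updates and linear averaging, can be sketched on a one-shot matrix game (a single information set, so no counterfactual weighting is needed):

```python
import numpy as np

def strategy(R):
    """Regret matching: play in proportion to the positive clipped regrets."""
    pos = np.maximum(R, 0)
    return pos / pos.sum() if pos.sum() > 0 else np.full(len(R), 1.0 / len(R))

def rm_plus(payoffs, iters=20000):
    """Self-play with regret matching+ and alternating updates on a one-shot
    zero-sum matrix game; `payoffs` holds the row player's payoffs and the
    column player receives their negation."""
    m, n = payoffs.shape
    R1, R2 = np.zeros(m), np.zeros(n)      # clipped cumulative regrets
    S1, S2 = np.zeros(m), np.zeros(n)      # weighted strategy sums
    for t in range(1, iters + 1):
        # the row player updates first...
        s1, s2 = strategy(R1), strategy(R2)
        u1 = payoffs @ s2
        R1 = np.maximum(R1 + u1 - s1 @ u1, 0)   # RM+: clip regrets at zero
        S1 += t * s1                             # CFR+ uses linear averaging
        # ...then the column player responds to the row's refreshed strategy
        u2 = -(strategy(R1) @ payoffs)
        R2 = np.maximum(R2 + u2 - s2 @ u2, 0)
        S2 += t * s2
    return S1 / S1.sum(), S2 / S2.sum()

# Asymmetric matching pennies; the unique equilibrium is (0.4, 0.6) for both.
game = np.array([[2.0, -1.0], [-1.0, 1.0]])
avg1, avg2 = rm_plus(game)
print(avg1.round(3), avg2.round(3))
```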


What game are we playing? End-to-end learning in normal and extensive form games

Ling, Chun Kai, Fang, Fei, Kolter, J. Zico

arXiv.org Machine Learning

Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents. This paper deals with the relatively under-explored but equally important "inverse" setting, where the parameters of the underlying game are not known to all agents, but must be learned through observations. We propose a differentiable, end-to-end learning framework for addressing this task. In particular, we consider a regularized version of the game, equivalent to a particular form of quantal response equilibrium, and develop 1) a primal-dual Newton method for finding such equilibrium points in both normal and extensive form games; and 2) a backpropagation method that lets us analytically compute gradients of all relevant game parameters through the solution itself. This ultimately lets us learn the game by training in an end-to-end fashion, effectively by integrating a "differentiable game solver" into the loop of larger deep network architectures. We demonstrate the effectiveness of the learning method in several settings including poker and security game tasks.
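
The regularized-equilibrium idea can be illustrated with a logit quantal response equilibrium on a small zero-sum matrix game; note that the paper computes such equilibria with a primal-dual Newton method, while the damped fixed-point iteration below is only a simple stand-in with an illustrative rationality parameter:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def logit_qre(A, lam=0.5, iters=2000, step=0.2):
    """Damped fixed-point iteration for the logit quantal response equilibrium
    of a zero-sum matrix game with row payoffs A; `lam` controls rationality
    (lam -> infinity recovers Nash, lam -> 0 gives uniform play)."""
    m, n = A.shape
    p, q = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    for _ in range(iters):
        p_br = softmax(lam * (A @ q))      # row player's quantal response
        q_br = softmax(-lam * (A.T @ p))   # column player's quantal response
        p = (1 - step) * p + step * p_br
        q = (1 - step) * q + step * q_br
    return p, q

A = np.array([[2.0, -1.0], [-1.0, 1.0]])
p, q = logit_qre(A)
print(p.round(3), q.round(3))   # a smoothed version of the Nash strategies
```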