
 Ganzfried, Sam


Nonparametric Strategy Test

arXiv.org Artificial Intelligence

We present a nonparametric statistical test for determining whether an agent is following a given mixed strategy in a repeated strategic-form game, given samples of the agent's play. This involves two components: determining whether the agent's frequencies of pure strategies are sufficiently close to the target frequencies, and determining whether the pure strategies selected are independent between different game iterations. Our integrated test applies a chi-squared goodness-of-fit test for the first component and a generalized Wald-Wolfowitz runs test for the second component. The results from both tests are combined using a Bonferroni correction to produce a complete test for a given significance level $\alpha$. We applied the test to publicly available data of human rock-paper-scissors play, consisting of 50 iterations of play for 500 human players. We test the null hypothesis that the players are following a uniform random strategy independently at each game iteration. Using a significance level of $\alpha = 0.05$, we conclude that 305 (61%) of the subjects are following the target strategy.
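
As a rough illustration of how such a test could be assembled, the following Python sketch combines a chi-squared goodness-of-fit test with a runs-based independence check. Here the generalized Wald-Wolfowitz component is approximated by a Monte Carlo permutation test on the runs count rather than the paper's exact statistic, and the function and variable names are our own.

    import numpy as np
    from scipy.stats import chisquare

    rng = np.random.default_rng(0)

    def count_runs(seq):
        """Number of maximal blocks of identical consecutive symbols."""
        return 1 + sum(a != b for a, b in zip(seq, seq[1:]))

    def strategy_test(seq, target_probs, alpha=0.05, n_perm=10000):
        """Sketch of the two-part test described in the abstract:
        (1) chi-squared goodness of fit of pure-strategy frequencies to the
            target mixed strategy, and
        (2) a runs test for independence across iterations, approximated here
            by a permutation test instead of a closed-form statistic.
        The two p-values are combined with a Bonferroni correction."""
        seq = np.asarray(seq)
        n, k = len(seq), len(target_probs)
        observed = np.bincount(seq, minlength=k)
        _, p_fit = chisquare(observed, f_exp=n * np.asarray(target_probs))
        # Permutation null: condition on the observed counts, shuffle order.
        r_obs = count_runs(seq)
        perm_runs = np.array([count_runs(rng.permutation(seq))
                              for _ in range(n_perm)])
        # Two-sided Monte Carlo p-value for the runs count.
        p_runs = 2 * min((perm_runs <= r_obs).mean(), (perm_runs >= r_obs).mean())
        p_runs = min(1.0, p_runs)
        reject = (p_fit < alpha / 2) or (p_runs < alpha / 2)  # Bonferroni
        return reject, p_fit, p_runs

    # Example: 50 rock-paper-scissors plays tested against uniform random play.
    plays = rng.integers(0, 3, size=50)
    print(strategy_test(plays, [1/3, 1/3, 1/3]))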


Observable Perfect Equilibrium

arXiv.org Artificial Intelligence

While Nash equilibrium has emerged as the central game-theoretic solution concept, many important games contain several Nash equilibria, and we must determine how to select among them in order to create real strategic agents. Several Nash equilibrium refinement concepts have been proposed and studied for sequential imperfect-information games, the most prominent being trembling-hand perfect equilibrium, quasi-perfect equilibrium, and, recently, one-sided quasi-perfect equilibrium. These concepts are robust to certain arbitrarily small mistakes and are guaranteed to always exist; however, we argue that none of these is the correct concept for developing strong agents in sequential games of imperfect information. We define a new equilibrium refinement concept for extensive-form games called observable perfect equilibrium, in which the solution is robust over trembles in publicly-observable action probabilities (but not necessarily over action probabilities that opposing players may be unable to observe). Observable perfect equilibrium correctly captures the assumption that the opponent is playing as rationally as possible given the mistakes that have been observed (while previous solution concepts do not). We prove that observable perfect equilibrium is always guaranteed to exist, and demonstrate that it leads to a different solution than the prior extensive-form refinements in no-limit poker. We expect observable perfect equilibrium to be a useful equilibrium refinement concept for modeling many important imperfect-information games of interest in artificial intelligence.
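
For context, these refinements share a tremble-based template. The textbook definition of trembling-hand perfection, which observable perfect equilibrium modifies by restricting which probabilities are required to tremble, reads schematically: a profile $\sigma$ is trembling-hand perfect if there exists a sequence $(\sigma^k)$ of totally mixed strategy profiles with $\sigma^k \to \sigma$ such that, for every player $i$ and every $k$, $\sigma_i \in \arg\max_{\sigma_i'} u_i(\sigma_i', \sigma^k_{-i})$. Per the abstract, observable perfect equilibrium instead requires robustness only to trembles in the publicly-observable action probabilities.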


Bayesian Opponent Modeling in Multiplayer Imperfect-Information Games

arXiv.org Artificial Intelligence

In many real-world settings agents engage in strategic interactions with multiple opposing agents who can employ a wide variety of strategies. The standard approach for designing agents for such settings is to compute or approximate a relevant game-theoretic solution concept such as Nash equilibrium and then follow the prescribed strategy. However, such a strategy ignores any observations of opponents' play, which may indicate shortcomings that can be exploited. We present an approach for opponent modeling in multiplayer imperfect-information games where we collect observations of opponents' play through repeated interactions. We run experiments against a wide variety of real opponents and exact Nash equilibrium strategies in three-player Kuhn poker and show that our algorithm significantly outperforms all of the agents, including the exact Nash equilibrium strategies.
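
The abstract does not spell out the underlying model, but one standard way to realize Bayesian opponent modeling is with Dirichlet posteriors over an opponent's action probabilities. The sketch below (class and infoset names are our own, not the paper's) maintains pseudo-counts per observed decision point and predicts with the posterior mean.

    import numpy as np

    class DirichletOpponentModel:
        """Sketch of Bayesian opponent modeling: a Dirichlet posterior over
        the opponent's action probabilities at each information set (keyed by
        an arbitrary hashable label).  The prior pseudo-counts could encode an
        equilibrium strategy; the paper's actual model may differ."""

        def __init__(self, n_actions, prior_counts=1.0):
            self.n_actions = n_actions
            self.prior = prior_counts
            self.counts = {}  # infoset -> observed action counts

        def observe(self, infoset, action):
            c = self.counts.setdefault(infoset, np.zeros(self.n_actions))
            c[action] += 1

        def predict(self, infoset):
            """Posterior mean of the opponent's mixed strategy here."""
            c = self.counts.get(infoset, np.zeros(self.n_actions))
            post = c + self.prior
            return post / post.sum()

    # Example: after seeing an opponent bet twice and check once in one spot,
    # the posterior mean shifts toward betting.
    model = DirichletOpponentModel(n_actions=2)
    for a in [1, 1, 0]:
        model.observe("K-high, first to act", a)
    print(model.predict("K-high, first to act"))  # [0.4, 0.6]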


Fictitious Play with Maximin Initialization

arXiv.org Artificial Intelligence

Nash equilibrium is the central solution concept in game theory. While a Nash equilibrium can be computed in polynomial time for two-player zero-sum games, it is PPAD-hard for two-player general-sum and multiplayer games, and it is widely believed that no efficient algorithms exist [6, 7, 8]. The best algorithm for computing an exact Nash equilibrium in multiplayer games is based on a non-convex quadratic program formulation and only scales to relatively small games [10]. For larger games several iterative algorithms have been considered; however, they have no theoretical guarantees and may have an extremely high degree of error. It has recently been shown that fictitious play produces a smaller degree of equilibrium approximation error in these games than regret minimization [11], though the average error still becomes relatively large as the game size increases. For example, for 3-player games with 10 strategies per player and all payoffs drawn uniformly at random from [0,1], the average equilibrium error from fictitious play is 0.056. The classic version of fictitious play initializes strategies for all players to play all actions with equal probability. In this paper we explore more sophisticated initialization approaches to improve the algorithm's performance. A strategic-form game consists of a finite set of players $N = \{1, \ldots, n\}$, a finite set of pure strategies $S_i$ for each player $i \in N$, and a utility function $u_i : S_1 \times \cdots \times S_n \to \mathbb{R}$ for each player.
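
As a minimal sketch of the idea, assuming the two-player zero-sum case (where a maximin strategy is computable by linear programming; the paper itself targets multiplayer games, and all function names here are our own), fictitious play can be seeded with maximin strategies instead of uniform ones:

    import numpy as np
    from scipy.optimize import linprog

    def maximin_strategy(A):
        """Maximin strategy for the row player of payoff matrix A via the
        standard LP: maximize v subject to x^T A >= v, sum(x) = 1, x >= 0."""
        m, n = A.shape
        c = np.zeros(m + 1)
        c[-1] = -1.0                               # linprog minimizes, so use -v
        A_ub = np.hstack([-A.T, np.ones((n, 1))])  # v - x^T A_j <= 0 per column j
        A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * m + [(None, None)])
        return res.x[:m]

    def fictitious_play(A, iters=10000, init_x=None, init_y=None):
        """Fictitious play for a two-player zero-sum game; each player
        best-responds to the opponent's empirical average strategy."""
        m, n = A.shape
        x_avg = np.full(m, 1.0 / m) if init_x is None else init_x.copy()
        y_avg = np.full(n, 1.0 / n) if init_y is None else init_y.copy()
        for t in range(1, iters + 1):
            br_x, br_y = np.argmax(A @ y_avg), np.argmin(x_avg @ A)
            x_avg += (np.eye(m)[br_x] - x_avg) / (t + 1)
            y_avg += (np.eye(n)[br_y] - y_avg) / (t + 1)
        return x_avg, y_avg

    # Maximin initialization: seed both players with their security strategies.
    A = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])  # RPS
    x0, y0 = maximin_strategy(A), maximin_strategy(-A.T)
    print(fictitious_play(A, init_x=x0, init_y=y0))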


Safe Equilibrium

arXiv.org Artificial Intelligence

In designing a strategy for a multiagent interaction, an agent must balance the assumption that opponents are behaving rationally against the risks that may occur if opponents behave irrationally. Most classic game-theoretic solution concepts, such as Nash equilibrium (NE), assume that all players are behaving rationally (and that this fact is common knowledge). On the other hand, a maximin strategy maximizes the worst-case guaranteed expected payoff; this limits the potential downside against a worst-case and potentially irrational opponent, but can also cause us to achieve significantly lower payoff against rational opponents. In two-player zero-sum games, Nash equilibrium and maximin strategies are equivalent (by the minimax theorem), and these two goals are completely aligned. But in non-zero-sum games and games with more than two players, this is not the case. In these games we can potentially obtain arbitrarily low payoff by following a Nash equilibrium strategy, but if we follow a maximin strategy we will likely play far too conservatively. While the assumption that opponents exhibit a degree of rationality and the desire to limit worst-case performance against irrational opponents are both reasonable, neither the Nash equilibrium nor the maximin solution concept is definitively compelling on its own. We propose a new solution concept that balances between these two extremes.
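
To make the trade-off concrete, here is a one-sided sketch (not the paper's formal solution concept; the function names and formulation are our own): choose a strategy maximizing a $p$-weighted mix of payoff against a presumed rational opponent strategy and worst-case payoff, so that $p = 1$ recovers a best response and $p = 0$ recovers maximin.

    import numpy as np
    from scipy.optimize import linprog

    def safe_strategy(A, y_rational, p):
        """Illustrative balancing rule: pick the row-player strategy x that
        maximizes  p * x^T A y_rational + (1 - p) * min_y x^T A y,
        i.e., assume the opponent plays a presumed rational strategy with
        probability p and adversarially otherwise.  Solved as one LP over
        variables (x, v), where v lower-bounds the worst-case payoff."""
        m, n = A.shape
        c = np.concatenate([-p * (A @ y_rational), [-(1.0 - p)]])
        A_ub = np.hstack([-A.T, np.ones((n, 1))])   # x^T A >= v, columnwise
        b_ub = np.zeros(n)
        A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
        bounds = [(0, None)] * m + [(None, None)]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=bounds)
        return res.x[:m]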


Human strategic decision making in parametrized games

arXiv.org Artificial Intelligence

Strong algorithms have been developed for game classes with many elements of complexity. For example, algorithms were recently able to defeat human professional players in 2-player [16, 3] and 6-player no-limit Texas hold'em [4]. These games have imperfect information, sequential actions, and very large state spaces, and the latter has more than two players (solving multiplayer games is more challenging than solving two-player zero-sum games from a complexity-theoretic perspective). However, these algorithms all require an extremely large amount of computational resources for offline and/or online computations and for optimizing neural network hyperparameters. A further limitation is that all these resources are used to solve just one very specific version of the game (e.g., Libratus and DeepStack assumed that all players start the hand with 200 times the big blind, and Pluribus assumed that all players start the hand with 100 times the big blind).


Computing Nash Equilibria in Multiplayer DAG-Structured Stochastic Games with Persistent Imperfect Information

arXiv.org Artificial Intelligence

Many important real-world settings contain multiple players interacting over an unknown duration with probabilistic state transitions, and are naturally modeled as stochastic games. Prior research on algorithms for stochastic games has focused on two-player zero-sum games, games with perfect information, and games with imperfect information that is local and does not extend between game states. We present an algorithm for approximating Nash equilibrium in multiplayer general-sum stochastic games with persistent imperfect information that extends throughout game play. We experiment on a 4-player imperfect-information naval strategic planning scenario. Using a new procedure, we are able to demonstrate that our algorithm computes a strategy that closely approximates Nash equilibrium in this game.


Algorithm for Computing Approximate Nash Equilibrium in Continuous Games with Application to Continuous Blotto

arXiv.org Artificial Intelligence

Successful algorithms have been developed for computing Nash equilibrium in a variety of finite game classes. However, solving continuous games, in which the pure strategy space is (potentially uncountably) infinite, is far more challenging. Nonetheless, many real-world domains have continuous action spaces, e.g., where actions refer to an amount of time, money, or other resource that is naturally modeled as being real-valued as opposed to integral. We present a new algorithm for computing Nash equilibrium strategies in continuous games. In addition to two-player zero-sum games, our algorithm also applies to multiplayer games and games of imperfect information. We experiment with our algorithm on a continuous imperfect-information Blotto game, in which two players distribute resources over multiple battlefields. Blotto games have frequently been used to model national security scenarios and have also been applied to electoral competition and auction theory. Experiments show that our algorithm is able to quickly compute close approximations of Nash equilibrium strategies for this game.
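
The abstract does not describe the algorithm's internals, so the following is only a generic sketch of one way to approximate equilibrium in a continuous Blotto game: fictitious play with sampled continuous best responses (all names and parameter choices here are illustrative, not the paper's).

    import numpy as np

    rng = np.random.default_rng(0)

    def payoffs(X, Y):
        """Row player's payoffs for all pairs of allocations: +1 per
        battlefield won, -1 per battlefield lost.  X: (k, f), Y: (t, f)."""
        return np.sign(X[:, None, :] - Y[None, :, :]).sum(axis=-1)

    def sampled_fictitious_play(n_fields=3, budget=1.0, iters=200, n_cands=200):
        """Each iteration, each player best-responds to the opponent's
        empirical mixture of past plays; continuous best responses are
        approximated by sampling random budget allocations from the simplex."""
        def sample(k):
            return budget * rng.dirichlet(np.ones(n_fields), size=k)
        hist_x, hist_y = sample(1), sample(1)
        for _ in range(iters):
            cands = sample(n_cands)
            best = cands[payoffs(cands, hist_y).mean(axis=1).argmax()]
            hist_x = np.vstack([hist_x, best])
            cands = sample(n_cands)
            best = cands[payoffs(hist_x, cands).mean(axis=0).argmin()]
            hist_y = np.vstack([hist_y, best])
        return hist_x, hist_y  # empirical mixtures approximate equilibrium play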


Prediction of Bayesian Intervals for Tropical Storms

AAAI Conferences

Building on recent research on prediction of hurricane trajectories using recurrent neural networks (RNNs), we have developed improved methods and generalized the approach to predict Bayesian intervals in addition to simple point estimates. Tropical storms are capable of causing severe damage, so accurately predicting their trajectories can bring significant benefits to cities and lives, especially as the storms grow more intense due to climate change. By implementing the Bayesian intervals using dropout in an RNN, we improve the actionability of the predictions, for example by estimating the areas to evacuate in the landfall region. We used an RNN to predict the trajectory of the storms at 6-hour intervals. We used latitude, longitude, wind speed, and pressure features from a Statistical Hurricane Intensity Prediction Scheme (SHIPS) dataset of about 500 tropical storms in the Atlantic Ocean. Our results show how neural network dropout values affect the predictions and intervals.
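
A minimal sketch of the dropout-based interval idea, assuming a PyTorch-style LSTM (the architecture, sizes, and names are our assumptions, not the paper's): keep dropout active at inference and read off empirical quantiles of repeated stochastic forward passes.

    import torch
    import torch.nn as nn

    class TrajectoryRNN(nn.Module):
        """Small LSTM regressor mapping a storm's feature history to its next
        position.  The feature layout (lat, lon, wind speed, pressure) follows
        the abstract; the architecture itself is an assumption."""

        def __init__(self, n_features=4, hidden=64, p_drop=0.2):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.drop = nn.Dropout(p_drop)
            self.head = nn.Linear(hidden, 2)  # (lat, lon) 6 hours ahead

        def forward(self, x):
            out, _ = self.lstm(x)
            return self.head(self.drop(out[:, -1]))

    def mc_dropout_interval(model, x, n_samples=100, q=0.05):
        """Monte Carlo dropout: leave dropout on at inference, sample many
        stochastic forward passes, and report empirical quantiles as an
        approximate Bayesian credible interval."""
        model.train()  # keeps dropout active; no optimizer step is taken
        with torch.no_grad():
            preds = torch.stack([model(x) for _ in range(n_samples)])
        return preds.quantile(q, dim=0), preds.quantile(1 - q, dim=0)

    # Example: interval for one storm's next position from 8 past observations.
    model = TrajectoryRNN()
    history = torch.randn(1, 8, 4)  # (batch, time steps, features)
    lo, hi = mc_dropout_interval(model, history)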


Most Important Fundamental Rule of Poker Strategy

AAAI Conferences

Poker is a large complex game of imperfect information, which has been singled out as a major AI challenge problem. Recently there has been a series of breakthroughs culminating in agents that have successfully defeated the strongest human players in two-player no-limit Texas hold'em. The strongest agents are based on algorithms for approximating Nash equilibrium strategies, which are stored in massive binary files and are unintelligible to humans. A recent line of research has explored approaches for extrapolating knowledge from strong game-theoretic strategies into a form that can be understood by humans. This would be useful when humans are the ultimate decision makers, allowing them to make better decisions from massive algorithmically-generated strategies. Using techniques from machine learning, we have uncovered a new simple, fundamental rule of poker strategy that leads to a significant improvement in performance over the best prior rule and can also easily be applied by human players.
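
To illustrate what extrapolating human-understandable knowledge from a strategy can look like in practice, here is a sketch, with entirely hypothetical features and labels (not the paper's actual rule or data), of extracting a simple threshold rule from strategy data with a depth-limited decision tree:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)

    # Hypothetical training data: each row is a decision point described by
    # simple features, labeled with the action an algorithmically generated
    # strategy takes there.  Features, labels, and rule are all illustrative.
    X = rng.uniform(size=(5000, 2))      # [hand_strength, pot_odds]
    y = (X[:, 0] > X[:, 1]).astype(int)  # 1 = call, 0 = fold

    # A depth-limited tree yields a rule a human can read and apply directly.
    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(tree, feature_names=["hand_strength", "pot_odds"]))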