Integrating Opponent Models with Monte-Carlo Tree Search in Poker

AAAI Conferences

In this paper we apply a Monte-Carlo Tree Search implementation that is boosted with domain knowledge to the game of poker. More specifically, we integrate an opponent model in the Monte-Carlo Tree Search algorithm to produce a strong poker playing program. Opponent models allow the search algorithm to focus on relevant parts of the game-tree. We use an opponent modelling approach that starts from a (learned) prior, i.e., general expectations about opponent behavior, and then learns a relational regression tree-function that adapts these priors to specific opponents. Our modelling approach can generate detailed game features or relations on-the-fly. Additionally, using a prior we can already make reasonable predictions even when limited experience is available for a particular player. We show that Monte-Carlo Tree Search with integrated opponent models performs well against state-of-the-art poker programs.


Learning Strategies for Opponent Modeling in Poker

AAAI Conferences

In poker, players tend to play sub-optimally due to theuncertainty in the game. Payoffs can be maximized byexploiting these sub-optimal tendencies. One way of realizingthis is to acquire the opponent strategy by recognizingthe key patterns in its style of play. Existing studieson opponent modeling in poker aim at predicting opponent’sfuture actions or estimating opponent’s hand.In this study, we propose a machine learning methodfor acquiring the opponent’s behavior for the purpose ofpredicting opponent’s future actions.We derived a numberof features to be used in modeling opponent’s strategy.Then, an ensemble learning method is proposed forgeneralizing the model. The proposed approach is testedon a set of test scenarios and shown to be effective.


Identifying Player\'s Strategies in No Limit Texas Hold\'em Poker through the Analysis of Individual Moves

arXiv.org Artificial Intelligence

The development of competitive artificial Poker playing agents has proven to be a challenge, because agents must deal with unreliable information and deception which make it essential to model the opponents in order to achieve good results. This paper presents a methodology to develop opponent modeling techniques for Poker agents. The approach is based on applying clustering algorithms to a Poker game database in order to identify player types based on their actions. First, common game moves were identified by clustering all players\' moves. Then, player types were defined by calculating the frequency with which the players perform each type of movement. With the given dataset, 7 different types of players were identified with each one having at least one tactic that characterizes him. The identification of player types may improve the overall performance of Poker agents, because it helps the agents to predict the opponent\'s moves, by associating each opponent to a distinct cluster.


Improving Performance in Imperfect-Information Games with Large State and Action Spaces by Solving Endgames

AAAI Conferences

Sequential games of perfect information can be solved by backward induction, where solutions to endgames are propagated up the game tree.  However, this does not work in imperfect-information games because different endgames can contain states that belong to the same information set and cannot be treated independently.  In fact, we show that this approach can fail even in a simple game with a unique equilibrium and a single endgame.  Nonetheless, we show that endgame solving can have significant benefits in imperfect-information games with large state and action spaces: computation of exact (rather than approximate) equilibrium strategies, computation of relevant equilibrium refinements, significantly finer-grained action and information abstraction, new information abstraction algorithms that take into account the relevant distribution of players' types entering the endgame, being able to select the coarseness of the action abstraction dynamically, additional abstraction techniques for speeding up endgame solving, a solution to the ``off-tree" problem, and using different degrees of probability thresholding in modeling versus playing. We discuss each of these topics in detail, and introduce techniques that enable one to conduct endgame solving in a scalable way even when the number of states and actions in the game is large. Our experiments on two-player no-limit Texas Hold'em poker show that our approach leads to significant performance improvements in practice.


Andrew Gilpin and Tuomas Sandholm

AAAI Conferences

We present a game theory-based heads-up Texas Hold'em poker player, GS1. To overcome the computational obstacles stemming from Texas Hold'em's gigantic game tree, the player employs our automated abstraction techniques to reduce the complexity of the strategy computations. Texas Hold'em consists of four betting rounds. Our player solves a large linear program (offline) to compute strategies for the abstracted first and second rounds. After the second betting round, our player updates the probability of each possible hand based on the observed betting actions in the first two rounds as well as the revealed cards. Using these updated probabilities, our player computes in real-time an equilibrium approximation for the last two abstracted rounds. We demonstrate that our player, which incorporates very little poker-specific knowledge, is competitive with leading poker-playing programs which incorporate extensive domain knowledge, as well as with advanced human players.