In extensive-form games with a large number of actions, careful abstraction of the action space is critically important to performance. In this paper we extend previous work on action abstraction using no-limit poker games as our test domains. We show that in such games it is no longer necessary to choose, a priori, one specific range of possible bet sizes. We introduce an algorithm that adjusts the range of bet sizes considered for each bet individually in an iterative fashion. This flexibility results in a substantially improved game value in no-limit Leduc poker. When applied to no-limit Texas Hold'em, our algorithm produces an action abstraction that is about one third the size of a state-of-the-art hand-crafted action abstraction, yet has a better overall game value.
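The abstract does not specify how the per-bet range adjustment works; the following is only a hypothetical sketch of one plausible iteration scheme, in which each betting decision keeps a sliding window of candidate sizes and shifts it toward whichever endpoint yields a higher estimated game value. The function `value_at` and the step rule are illustrative assumptions, not the paper's algorithm.

```python
def adjust_bet_range(lo, hi, value_at, step_frac=0.25):
    """One iteration of a hypothetical sliding-window adjustment for a
    single betting decision.

    lo, hi       : current window of bet sizes (e.g. fractions of the pot)
    value_at(s)  : assumed callback returning the estimated game value
                   when this bet is fixed to size s
    Slides the window toward the endpoint with the higher value, never
    letting the lower bound go negative.
    """
    step = step_frac * (hi - lo)
    if value_at(hi) > value_at(lo):
        return lo + step, hi + step          # favor larger bet sizes
    new_lo = max(0.0, lo - step)             # favor smaller bet sizes
    return new_lo, new_lo + (hi - lo)
```

Iterating this update for every betting decision independently would let each bet's range drift toward locally better sizes instead of fixing one global range a priori.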
Sequential games of perfect information can be solved by backward induction, where solutions to endgames are propagated up the game tree. However, this does not work in imperfect-information games because different endgames can contain states that belong to the same information set and cannot be treated independently. In fact, we show that this approach can fail even in a simple game with a unique equilibrium and a single endgame. Nonetheless, we show that endgame solving can have significant benefits in imperfect-information games with large state and action spaces: computation of exact (rather than approximate) equilibrium strategies; computation of relevant equilibrium refinements; significantly finer-grained action and information abstraction; new information abstraction algorithms that take into account the relevant distribution of players' types entering the endgame; being able to select the coarseness of the action abstraction dynamically; additional abstraction techniques for speeding up endgame solving; a solution to the "off-tree" problem; and using different degrees of probability thresholding in modeling versus playing. We discuss each of these topics in detail, and introduce techniques that enable one to conduct endgame solving in a scalable way even when the number of states and actions in the game is large. Our experiments on two-player no-limit Texas Hold'em poker show that our approach leads to significant performance improvements in practice.
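One ingredient mentioned above, conditioning on the distribution of players' types entering the endgame, amounts to a Bayes' rule computation: the posterior over a player's private types at the endgame root is the prior weighted by the probability that the trunk strategy would have taken the observed actions with each type. This sketch assumes dictionary inputs keyed by type; the names are illustrative, not the paper's API.

```python
def endgame_type_distribution(prior, reach_prob):
    """Posterior over a player's private types at the root of an endgame.

    prior[t]      : prior probability of holding type t (e.g. uniform
                    over private hands)
    reach_prob[t] : probability that the trunk strategy takes the
                    observed action sequence when holding type t
    Bayes' rule: P(t | actions) is proportional to prior[t] * reach_prob[t].
    """
    joint = {t: prior[t] * reach_prob[t] for t in prior}
    z = sum(joint.values())
    return {t: p / z for t, p in joint.items()}
```

The resulting distribution can then seed an exact solve of the endgame in place of the abstract trunk strategy's approximate continuation.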
Equilibrium or near-equilibrium solutions to very large extensive-form games are often computed by using abstractions to reduce the game size. A common abstraction technique for games with a large number of available actions is to restrict the number of legal actions in every state. This method has been used to discover equilibrium solutions for the game of no-limit heads-up Texas Hold'em. When using a solution to an abstracted game to play one side in the un-abstracted (real) game, the real opponent actions may not correspond to actions in the abstracted game. The most popular method for handling this situation is to translate opponent actions in the real game to the closest legal actions in the abstracted game. We show that this approach can result in a very exploitable player and propose an alternative solution. We use probabilistic mapping to translate a real action into a probability distribution over actions, whose weights are determined by a similarity metric. We show that this approach significantly reduces the exploitability when using an abstract solution in the real game.
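The abstract does not specify the similarity metric, so the sketch below uses simple inverse-distance (linear interpolation) weights between the two neighboring abstract bet sizes as a stand-in; the function names and the metric itself are assumptions for illustration.

```python
import random

def probabilistic_translate(real_bet, abstract_bets):
    """Map a real bet size to a probability distribution over abstract
    bet sizes, weighting the two neighboring abstract sizes by inverse
    distance (an assumed similarity metric, not the paper's).
    """
    sizes = sorted(abstract_bets)
    if real_bet <= sizes[0]:
        return {sizes[0]: 1.0}
    if real_bet >= sizes[-1]:
        return {sizes[-1]: 1.0}
    for a, b in zip(sizes, sizes[1:]):      # find neighbors a <= x <= b
        if a <= real_bet <= b:
            w_b = (real_bet - a) / (b - a)  # closer to b -> more weight on b
            return {a: 1.0 - w_b, b: w_b}

def sample_translation(real_bet, abstract_bets, rng=random):
    """Sample one abstract action from the translation distribution."""
    dist = probabilistic_translate(real_bet, abstract_bets)
    actions, weights = zip(*dist.items())
    return rng.choices(actions, weights=weights)[0]
```

Under deterministic nearest-action translation, an opponent can probe the boundary between two abstract sizes and exploit the abrupt switch; randomizing over both neighbors smooths that boundary away.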
When solving extensive-form games with large action spaces, typically significant abstraction is needed to make the problem manageable from a modeling or computational perspective. When this occurs, a procedure is needed to interpret actions of the opponent that fall outside of our abstraction (by mapping them to actions in our abstraction). This is called an action translation mapping. Prior action translation mappings have been based on heuristics without theoretical justification. We show that the prior mappings are highly exploitable and that most of them violate certain natural desiderata. We present a new mapping that satisfies these desiderata and has significantly lower exploitability than the prior mappings. Furthermore, we observe that the cost of this worst-case performance benefit (low exploitability) is not high in practice; our mapping performs competitively with the prior mappings against no-limit Texas Hold'em agents submitted to the 2012 Annual Computer Poker Competition. We also observe several paradoxes that can arise when performing action abstraction and translation; for example, we show that it is possible to improve performance by including suboptimal actions in our abstraction and excluding optimal actions.
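The mapping proposed in this line of work is commonly known as the pseudo-harmonic mapping; assuming that is the mapping intended here, with bet sizes expressed as fractions of the pot, the probability of translating a real bet x between abstract sizes A and B down to A can be sketched as follows.

```python
def pseudo_harmonic_weight(x, A, B):
    """Probability of mapping a real bet x (with A <= x <= B) to the
    smaller abstract bet A under the pseudo-harmonic mapping; bet sizes
    are fractions of the pot.

        f_{A,B}(x) = (B - x)(1 + A) / ((B - A)(1 + x))

    At x = A the weight is 1, at x = B it is 0, and it decreases
    monotonically in between.
    """
    return (B - x) * (1 + A) / ((B - A) * (1 + x))
```

Note the asymmetry: a half-pot bet between abstract sizes 0 (check) and 1 (pot) maps to the pot-sized bet with probability 2/3, not 1/2, which is precisely where this mapping departs from naive linear interpolation and where its lower exploitability comes from.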