Hawkin, John Alexander
Using Sliding Windows to Generate Action Abstractions in Extensive-Form Games
Hawkin, John Alexander (University of Alberta) | Holte, Robert (University of Alberta) | Szafron, Duane (University of Alberta)
In extensive-form games with a large number of actions, careful abstraction of the action space is critically important to performance. In this paper we extend previous work on action abstraction using no-limit poker games as our test domains. We show that in such games it is no longer necessary to choose, a priori, one specific range of possible bet sizes. We introduce an algorithm that adjusts the range of bet sizes considered for each bet individually in an iterative fashion. This flexibility results in a substantially improved game value in no-limit Leduc poker. When applied to no-limit Texas Hold'em, our algorithm produces an action abstraction that is about one third the size of a state-of-the-art hand-crafted action abstraction, yet has a better overall game value.
Automated Action Abstraction of Imperfect Information Extensive-Form Games
Hawkin, John Alexander (University of Alberta) | Holte, Robert (University of Alberta) | Szafron, Duane (University of Alberta)
Multi-agent decision problems can often be formulated as extensive-form games. We focus on imperfect information extensive-form games in which one or more actions at many decision points have an associated continuous or many-valued parameter. A stock trading agent, in addition to deciding whether to buy or not, must decide how much to buy. In no-limit poker, in addition to selecting a probability for each action, the agent must decide how much to bet for each betting action. Selecting values for these parameters makes these games extremely large. Two-player no-limit Texas Hold'em poker with stacks of 500 big blinds has approximately 10^71 states, which is more than 10^50 times more states than two-player limit Texas Hold'em. The main contribution of this paper is a technique that abstracts a game's action space by selecting one, or a small number, of the many values for each parameter. We show that strategies computed using this new algorithm for no-limit Leduc poker exhibit significant utility gains over epsilon-Nash equilibrium strategies computed with standard, hand-crafted parameter value abstractions.