Supplementary Materials For XDO: A Double Oracle Algorithm for Extensive-Form Games

1 Proofs

Proposition 1. In XDO with an null

Neural Information Processing Systems 

's population policies chooses action

In a given iteration, consider the restricted game for a single GMP game. If player 2 is not allowed an action unavailable to player 1, player 2's BR will be a new action

In (k, m)-clone GMP with n classes, XDO adds at most 2n actions for each player. In total, 2n actions may be added for each player.

Proposition 6.

Like in that work, we represent actions that are in the restricted game by bold arrows. Extensive-form pure strategies specify an action at every infostate.
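To make concrete the statement that an extensive-form pure strategy specifies an action at every infostate, here is a minimal sketch in Python. The toy game, the infostate names, and the helper `enumerate_pure_strategies` are hypothetical and serve only as illustration; they are not taken from the XDO paper or its code.

```python
from itertools import product

# Hypothetical toy game for one player: each infostate maps to its legal actions.
legal_actions = {
    "root": ["L", "R"],
    "after_L": ["a", "b", "c"],
}

def enumerate_pure_strategies(legal_actions):
    """Return every pure strategy as a dict {infostate: action}.

    A pure strategy picks exactly one action at every infostate, so the
    number of pure strategies is the product of the action counts.
    """
    infostates = sorted(legal_actions)
    choices = [legal_actions[s] for s in infostates]
    return [dict(zip(infostates, combo)) for combo in product(*choices)]

pure_strategies = enumerate_pure_strategies(legal_actions)
```

Because every infostate must be assigned an action, the number of pure strategies grows as the product of the per-infostate action counts, which is one reason a restricted game that limits the available actions at each infostate is much smaller than the full game.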
