Fictitious Play Outperforms Counterfactual Regret Minimization
–arXiv.org Artificial Intelligence
In two-player zero-sum games a Nash equilibrium strategy is guaranteed to win (or tie) in expectation against any opposing strategy by the minimax theorem. In games with m ore than two players there can be multiple equilibria with different values to the players, and follow ing one has no performance guarantee; however, it was shown that a Nash equilibrium strategy defeated a variet y of agents submitted for a class project in a 3-player imperfect-information game, Kuhn poker [13]. Thi s demonstrates that Nash equilibrium strategies can be successful in practice despite the fact that they do no t have a performance guarantee. While Nash equilibrium can be computed in polynomial time fo r two-player zero-sum games, it is PPAD-hard to compute for nonzero-sum and games with 3 or mor e agents and widely believed that no efficient algorithms exist [8, 9]. Counterfactual regret mi nimization (CFR) is an iterative self-play procedure that has been proven to converge to Nash equilibrium in two-p layer zero-sum [28].
arXiv.org Artificial Intelligence
Jan-29-2020
- Country:
- North America
- United States > Texas (0.05)
- Canada
- Alberta (0.14)
- British Columbia (0.04)
- North America
- Genre:
- Research Report (0.84)
- Industry:
- Leisure & Entertainment > Games > Poker (0.47)
- Technology:
- Information Technology
- Game Theory (1.00)
- Artificial Intelligence
- Representation & Reasoning > Agents (0.72)
- Games > Poker (0.69)
- Information Technology