Monte Carlo Sampling for Regret Minimization in Extensive Games

Lanctot, Marc, Waugh, Kevin, Zinkevich, Martin, Bowling, Michael

Feb-15-2020, 02:28:18 GMT–Neural Information Processing Systems

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect-information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent, sample-based CFR algorithms called Monte Carlo counterfactual regret minimization (MCCFR), of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR in expectation.
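The claim that sampled updates match the full updates in expectation rests on importance weighting: a value computed from one sampled outcome is divided by the probability of sampling that outcome. The following is a minimal toy sketch of that idea only, not the algorithm from the paper. The three outcomes, their reach probabilities and utilities, the sampling distribution, and all function names are assumptions made up for illustration.

```python
from fractions import Fraction

def full_value(reach, utility):
    """Exact value: sum over every terminal outcome (what a full traversal computes)."""
    return sum(reach[z] * utility[z] for z in reach)

def sampled_value(z, reach, utility, sample_prob):
    """Estimate built from a single sampled outcome z, importance-weighted by 1/q(z)."""
    return reach[z] * utility[z] / sample_prob[z]

def expected_sampled_value(reach, utility, sample_prob):
    """Expectation of the sampled estimate, enumerated exactly over the sampling distribution."""
    return sum(
        sample_prob[z] * sampled_value(z, reach, utility, sample_prob)
        for z in reach
    )

# Hypothetical toy data: three terminal outcomes.
reach = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
utility = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(6)}
# Any sampling distribution works as long as every outcome has nonzero probability.
sample_prob = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

exact = full_value(reach, utility)
expected = expected_sampled_value(reach, utility, sample_prob)
print(exact, expected)
```

Exact fractions are used so the two quantities can be compared for equality rather than up to floating-point error. The sampling probability cancels term by term, which is why the choice of sampling distribution affects variance but not the expected value in this toy setting.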

extensive game, monte carlo sampling, regret minimization

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.48)