The Power of Perturbation under Sampling in Solving Extensive-Form Games

Masaka, Wataru, Sakamoto, Mitsuki, Abe, Kenshi, Ariu, Kaito, Sandholm, Tuomas, Iwasaki, Atsushi

Jan-27-2025–arXiv.org Artificial Intelligence

This paper investigates how perturbation does and does not improve the Follow-the-Regularized-Leader (FTRL) algorithm in imperfect-information extensive-form games. Perturbing the expected payoffs guarantees that the FTRL dynamics reach an approximate equilibrium, and proper adjustments of the magnitude of the perturbation lead to a Nash equilibrium (\textit{last-iterate convergence}). This approach is robust even when payoffs are estimated using sampling -- as is the case for large games -- while the optimistic approach often becomes unstable. Building upon those insights, we first develop a general framework for perturbed FTRL algorithms under \textit{sampling}. We then empirically show that in the last-iterate sense, the perturbed FTRL consistently outperforms the non-perturbed FTRL. We further identify a divergence function that reduces the variance of the estimates for perturbed payoffs, with which it significantly outperforms the prior algorithms on Leduc poker (whose structure is more asymmetric in a sense than that of the other benchmark games) and consistently performs smooth convergence behavior on all the benchmark games.

artificial intelligence, exploitability, machine learning, (17 more...)

arXiv.org Artificial Intelligence

Jan-27-2025

arXiv.org PDF

Add feedback

Country:
- Europe > Netherlands (0.04)

Genre:
- Research Report (1.00)

Industry:
- Leisure & Entertainment > Games (0.70)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning (0.93)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found