Efficient Last-Iterate Convergence in Solving Extensive-Form Games
–Neural Information Processing Systems
To establish last-iterate convergence for Counterfactual Regret Minimization (CFR) algorithms in learning a Nash equilibrium (NE) of extensive-form games (EFGs), recent studies reformulate learning an NE of the original EFG as learning the NEs of a sequence of (perturbed) regularized EFGs. Hence, proving last-iterate convergence in solving the original EFG reduces to proving last-iterate convergence in solving (perturbed) regularized EFGs. However, these studies only establish last-iterate convergence for Online Mirror Descent (OMD)-based CFR algorithms instead of Regret Matching (RM)-based CFR algorithms in solving perturbed regularized EFGs, resulting in a poor empirical convergence rate, as RM-based CFR algorithms typically outperform OMD-based CFR algorithms. In addition, as solving multiple perturbed regularized EFGs is required, fine-tuning across multiple perturbed regularized EFGs is infeasible, making parameter-free algorithms highly desirable. This paper show that CFR+, a classical parameter-free RM-based CFR algorithm, achieves last-iterate convergence in learning an NE of perturbed regularized EFGs. This is the first parameter-free last-iterate convergence for RM-based CFR algorithms in perturbed regularized EFGs. Leveraging CFR+ to solve perturbed regularized EFGs, we get Reward Transformation CFR+ (RTCFR+). Importantly, we extend prior work on the parameter-free property of CFR+, enhancing its stability, which is vital for the empirical convergence of RTCFR+. Experiments show that RTCFR+ exhibits a significantly faster empirical convergence rate than existing algorithms that achieve theoretical last-iterate convergence.
Neural Information Processing Systems
Jun-19-2026, 19:46:26 GMT
- Country:
- Asia > China (0.28)
- North America > Canada (0.27)
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Research Report
- Industry:
- Leisure & Entertainment > Games (1.00)
- Information Technology (0.92)
- Technology: