Near-Optimal Reinforcement Learning with Self-Play
–Neural Information Processing Systems
This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games. We focus on self-play algorithms, which learn the optimal policy by playing against themselves without any direct supervision.
Feb-7-2026, 14:48:56 GMT