Policy Optimization for Markov Games: Unified Framework and Faster Convergence

Dec-24-2025, 17:31:46 GMT–Neural Information Processing Systems

We begin by proposing an algorithmic framework for two-player zero-sum Markov games in the full-information setting. Each iteration consists of a policy update step at each state, performed by a chosen matrix game algorithm, and a value update step with a chosen learning rate.
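To make the two steps concrete, here is a minimal sketch of one possible instance of such a framework for a finite-horizon game. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not say which matrix game algorithm or learning rate is used, so the sketch plugs in a multiplicative-weights (Hedge) step as the matrix game algorithm and a learning rate of 1/t. The names `hedge_step`, `run_framework`, `eta`, and the array layouts `rewards[h][s][a][b]` and `transitions[h][s][a][b]` are all hypothetical.

```python
import math


def hedge_step(policy, payoffs, eta):
    """One multiplicative-weights step: shift mass toward higher-payoff actions."""
    weights = [p * math.exp(eta * g) for p, g in zip(policy, payoffs)]
    total = sum(weights)
    return [w / total for w in weights]


def run_framework(rewards, transitions, num_iters=200, eta=0.5):
    """Sketch of a policy-update / value-update loop for a zero-sum Markov game.

    rewards[h][s][a][b]     -- payoff to the max-player at step h, state s
    transitions[h][s][a][b] -- list of next-state probabilities
    Both layouts, the Hedge update, and alpha = 1/t are assumptions.
    """
    H, S = len(rewards), len(rewards[0])
    A, B = len(rewards[0][0]), len(rewards[0][0][0])
    Q = [[[[0.0] * B for _ in range(A)] for _ in range(S)] for _ in range(H)]
    mu = [[[1.0 / A] * A for _ in range(S)] for _ in range(H)]  # max-player
    nu = [[[1.0 / B] * B for _ in range(S)] for _ in range(H)]  # min-player

    for t in range(1, num_iters + 1):
        alpha = 1.0 / t  # assumed learning rate for the value update
        v_next = [0.0] * S  # value after the final step is zero
        for h in reversed(range(H)):
            v_here = [0.0] * S
            for s in range(S):
                q, x, y = Q[h][s], mu[h][s], nu[h][s]
                # Policy update: each player takes one matrix-game step
                # against the opponent's current policy on the current Q.
                gains = [sum(q[a][b] * y[b] for b in range(B)) for a in range(A)]
                losses = [sum(x[a] * q[a][b] for a in range(A)) for b in range(B)]
                mu[h][s] = hedge_step(x, gains, eta)
                nu[h][s] = hedge_step(y, [-l for l in losses], eta)
                # Value update: move Q toward the one-step backup target.
                for a in range(A):
                    for b in range(B):
                        nxt = transitions[h][s][a][b]
                        target = rewards[h][s][a][b] + sum(
                            p * v for p, v in zip(nxt, v_next)
                        )
                        q[a][b] = (1 - alpha) * q[a][b] + alpha * target
                # State value under the freshly updated policies.
                v_here[s] = sum(
                    mu[h][s][a] * q[a][b] * nu[h][s][b]
                    for a in range(A)
                    for b in range(B)
                )
            v_next = v_here
    return mu, nu, Q
```

As a usage example, take a two-step, single-state game whose stage payoff matrix is `[[1, 2], [0, 1]]`. The first action is dominant for both players, so `run_framework` should concentrate both policies on it. Other matrix game algorithms or learning-rate schedules could be swapped in at the two marked steps.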

algorithm, markov game, policy optimization, (10 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)