Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games

Jan-19-2025, 02:49:43 GMT–Neural Information Processing Systems

We study the problem of finding the Nash equilibrium in a two-player zero-sum Markov game. Because the problem is formulated as a minimax optimization program, a natural approach is to perform gradient descent/ascent with respect to each player in an alternating fashion. However, because the underlying objective function is non-convex/non-concave, theoretical understanding of this method is limited. In our paper, we consider solving an entropy-regularized variant of the Markov game. The regularization introduces structure into the optimization landscape that makes the solutions more identifiable and allows the problem to be solved more efficiently.
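To make the idea of alternating gradient descent/ascent on an entropy-regularized objective concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a single-state game (a matrix game, the simplest special case of a Markov game), softmax-parameterized policies, and fixed values for the regularization weight `tau` and the step size `step`. The function name `regularized_gda` and all parameter choices are hypothetical and chosen for illustration only.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def regularized_gda(payoff, tau=0.5, step=0.1, iters=5000):
    """Alternating gradient ascent/descent on the regularized objective
    f(x, y) = x^T A y + tau * H(x) - tau * H(y),
    where H is the Shannon entropy, x maximizes and y minimizes.
    Assumption: single-state game with softmax policies (illustrative only)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Arbitrary non-uniform starting logits (an assumption of this sketch).
    theta = [1.0 / (i + 1) for i in range(n_rows)]
    phi = [-0.5 * j for j in range(n_cols)]
    for _ in range(iters):
        x, y = softmax(theta), softmax(phi)
        # Ascent step for the max player: gradient of f with respect to x
        # (up to a constant that cancels), pushed through the softmax Jacobian.
        q = [sum(payoff[i][j] * y[j] for j in range(n_cols)) - tau * math.log(x[i])
             for i in range(n_rows)]
        base = sum(x[i] * q[i] for i in range(n_rows))
        theta = [theta[i] + step * x[i] * (q[i] - base) for i in range(n_rows)]
        # Descent step for the min player, using the max player's updated policy.
        x = softmax(theta)
        r = [sum(payoff[i][j] * x[i] for i in range(n_rows)) + tau * math.log(y[j])
             for j in range(n_cols)]
        base = sum(y[j] * r[j] for j in range(n_cols))
        phi = [phi[j] - step * y[j] * (r[j] - base) for j in range(n_cols)]
    return softmax(theta), softmax(phi)

# Matching pennies: by symmetry, the regularized equilibrium is uniform for both players.
x, y = regularized_gda([[1.0, -1.0], [-1.0, 1.0]])
print(x, y)
```

In this toy setting the entropy terms make the objective strongly concave in `x` and strongly convex in `y`, which is the kind of added structure the abstract refers to. Without regularization (`tau=0`), the same updates would be expected to cycle around the equilibrium of matching pennies rather than settle on it.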

algorithm, regularized gradient descent ascent, two-player zero-sum markov game, (2 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)