A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

Dec-27-2025, 04:13:37 GMT–Neural Information Processing Systems

In this work, we study two-player zero-sum stochastic games and develop a variant of the smoothed best-response learning dynamics that combines independent learning dynamics for matrix games with the minimax value iteration for stochastic games. The resulting learning dynamics are payoff-based, convergent, rational, and symmetric between the two players.
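To make the matrix-game ingredient concrete, the following is a minimal sketch of payoff-based smoothed best-response dynamics in a single zero-sum matrix game. It is not the algorithm analyzed in the paper, which also runs minimax value iteration across the states of a stochastic game. The function names, the softmax form of the smoothed best response, the temperature `tau`, and the step sizes `alpha` and `beta` are all illustrative assumptions.

```python
import math
import random


def smoothed_best_response(q, tau):
    """Softmax of q / tau: a smoothed (logit) best response with temperature tau."""
    m = max(q)  # subtract the max for numerical stability
    weights = [math.exp((x - m) / tau) for x in q]
    total = sum(weights)
    return [w / total for w in weights]


def run_matrix_game(payoff, steps=20000, tau=0.1, alpha=0.05, beta=0.01, seed=0):
    """Illustrative payoff-based dynamics (assumed form, not the paper's exact updates).

    payoff[i][j] is the row player's reward; the column player receives its negative.
    Each player uses only its own action and its own realized reward.
    """
    rng = random.Random(seed)
    n_row, n_col = len(payoff), len(payoff[0])
    pi = [[1.0 / n_row] * n_row, [1.0 / n_col] * n_col]  # policies
    q = [[0.0] * n_row, [0.0] * n_col]                   # per-action payoff estimates
    for _ in range(steps):
        a = rng.choices(range(n_row), weights=pi[0])[0]
        b = rng.choices(range(n_col), weights=pi[1])[0]
        rewards = (payoff[a][b], -payoff[a][b])
        for player, action in ((0, a), (1, b)):
            # Update only the estimate of the action actually played.
            q[player][action] += alpha * (rewards[player] - q[player][action])
            # Move the policy a small step toward the smoothed best response.
            target = smoothed_best_response(q[player], tau)
            pi[player] = [(1 - beta) * p + beta * t
                          for p, t in zip(pi[player], target)]
    return pi, q


if __name__ == "__main__":
    # Matching pennies: the row player wins when the two actions match.
    policies, estimates = run_matrix_game([[1, -1], [-1, 1]])
    print(policies)
```

The two players run the same update rule and neither observes the opponent's action or policy, which is the sense in which this sketch is symmetric and payoff-based. Updating the policy on a slower timescale than the payoff estimates (`beta` smaller than `alpha`) is a design choice of this sketch, not a claim about the paper's step-size schedule.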

finite-sample analysis, independent learning, payoff-based independent learning, (6 more...)

Technology:
- Information Technology > Artificial Intelligence (0.42)