Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model
Gen Li (UPenn), Yuejie Chi (CMU), Yuting Wei (UPenn), Yuxin Chen (UPenn)

Aug-15-2025, 07:38:12 GMT–Neural Information Processing Systems

All prior results suffer from at least one of the two obstacles: the curse of multiple agents and the barrier of long horizon, regardless of the sampling protocol in use. We take a step towards settling this problem, assuming access to a flexible sampling mechanism: the generative model. Focusing on non-stationary finite-horizon Markov games, we develop a fast learning algorithm called Q-FTRL and an adaptive sampling scheme that leverage the optimism principle in online adversarial learning (particularly the Follow-the-Regularized-Leader (FTRL) method).
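
The abstract names Follow-the-Regularized-Leader (FTRL) as the online-learning building block. As a rough illustration only, the sketch below runs generic entropy-regularized FTRL in self-play on a small two-player zero-sum matrix game and measures how far the averaged policies are from equilibrium. It is not the paper's Q-FTRL, which additionally relies on a generative model, an adaptive sampling scheme, and finite-horizon value estimates. The payoff matrix, learning rate, round count, and all function names here are assumptions made for the example.

```python
import math

def ftrl_entropy(cum_loss, eta):
    """FTRL over the simplex with a negative-entropy regularizer.
    Closed form: weights proportional to exp(-eta * cumulative loss)."""
    shift = min(cum_loss)  # subtract the minimum for numerical stability
    weights = [math.exp(-eta * (c - shift)) for c in cum_loss]
    total = sum(weights)
    return [w / total for w in weights]

def self_play(payoff, rounds=5000, eta=0.01):
    """Both players run FTRL against each other; returns averaged policies.
    payoff[i][j] is the row player's loss and the column player's gain."""
    n, m = len(payoff), len(payoff[0])
    cum_row, cum_col = [0.0] * n, [0.0] * m
    avg_row, avg_col = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        p = ftrl_entropy(cum_row, eta)
        q = ftrl_entropy(cum_col, eta)
        # Each player observes its expected loss vector against the opponent.
        for i in range(n):
            cum_row[i] += sum(payoff[i][j] * q[j] for j in range(m))
        for j in range(m):
            cum_col[j] -= sum(payoff[i][j] * p[i] for i in range(n))
        avg_row = [a + x / rounds for a, x in zip(avg_row, p)]
        avg_col = [a + x / rounds for a, x in zip(avg_col, q)]
    return avg_row, avg_col

def duality_gap(payoff, p, q):
    """Best-response gain against p minus best-response loss against q.
    The gap is zero exactly at a Nash equilibrium of the matrix game."""
    n, m = len(payoff), len(payoff[0])
    best_col = max(sum(payoff[i][j] * p[i] for i in range(n)) for j in range(m))
    best_row = min(sum(payoff[i][j] * q[j] for j in range(m)) for i in range(n))
    return best_col - best_row

# Hypothetical 2x2 zero-sum game (entries are the row player's losses).
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]
avg_p, avg_q = self_play(PAYOFF)
print("average row policy:", avg_p)
print("average column policy:", avg_q)
print("duality gap:", duality_gap(PAYOFF, avg_p, avg_q))
```

The duality gap of the averaged policies equals the sum of the two players' regrets divided by the number of rounds, so it shrinks as the number of rounds grows. This is the standard reason no-regret methods such as FTRL are used for equilibrium computation.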



Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.46)
- Instructional Material (0.34)

Industry:
- Leisure & Entertainment (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Machine Learning > Reinforcement Learning (1.00)
