Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Oct-9-2024, 12:21:54 GMT–Neural Information Processing Systems

Model-based reinforcement learning (RL), which finds an optimal policy using an empirical model, has long been recognized as one of the cornerstones of RL. It is especially suitable for multi-agent RL (MARL), as it naturally decouples the learning and planning phases and avoids the non-stationarity problem that arises when all agents improve their policies simultaneously using samples. Though intuitive and widely used, model-based MARL algorithms have received relatively little study of their sample complexity. In this paper, we aim to address the fundamental open question about the sample complexity of model-based MARL. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model of the state transitions.
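To make the "learning phase" concrete, the following is a minimal sketch of how an empirical transition model might be built from a generative model in a two-player game. It is an illustration under stated assumptions, not the paper's implementation: the function names (`estimate_model`, `toy_sampler`), the toy two-state dynamics, and the choice of a fixed number of samples per state and joint action are all hypothetical.

```python
import random
from collections import Counter


def estimate_model(sample_next_state, states, actions_a, actions_b, n_samples, rng):
    """Build an empirical transition model by querying a generative model.

    sample_next_state(s, a, b, rng) is a hypothetical sampler that returns one
    next state for state s and joint action (a, b).
    Returns p_hat[(s, a, b)] = {next_state: empirical probability}.
    """
    p_hat = {}
    for s in states:
        for a in actions_a:
            for b in actions_b:
                # Draw n_samples independent next states and count outcomes.
                counts = Counter(
                    sample_next_state(s, a, b, rng) for _ in range(n_samples)
                )
                p_hat[(s, a, b)] = {s2: c / n_samples for s2, c in counts.items()}
    return p_hat


def toy_sampler(s, a, b, rng):
    """Made-up generative model: two states, two actions per player.

    If the players' actions match, the state is kept with probability 0.9;
    otherwise it is kept with probability 0.1.
    """
    stay = 0.9 if a == b else 0.1
    return s if rng.random() < stay else 1 - s


rng = random.Random(0)
p_hat = estimate_model(toy_sampler, [0, 1], [0, 1], [0, 1], n_samples=2000, rng=rng)
```

The planning phase would then compute an equilibrium of the empirical game defined by `p_hat` (together with the reward function) using any planner for zero-sum Markov games; that step uses no further samples and is not shown here.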

artificial intelligence, model-based multi-agent rl, near-optimal sample complexity

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)