Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL
Minshuo Chen, Yan Li, Ethan Wang, Zhuoran Yang
Neural Information Processing Systems
Mean-Field Multi-Agent Reinforcement Learning (MF-MARL) is attractive in applications involving a large population of homogeneous agents, as it exploits the permutation invariance of agents and avoids the curse of many agents. Most existing results focus only on online settings, in which agents can interact with the environment during training. In some applications, such as social welfare optimization, however, interaction with the societal system during training can be prohibitive or even unethical. To bridge this gap, we propose SAFARI (peSsimistic meAn-Field vAlue iteRatIon), an algorithm for offline MF-MARL that requires only a handful of pre-collected experience data.
May-29-2025, 11:21:57 GMT