Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization Xiangsen Wang
–Neural Information Processing Systems
Despite some success in the single-agent setting, offline multi-agent RL (MARL) remains to be a challenge. The large joint state-action space and the coupled multi-agent behaviors pose extra complexities for offline policy optimization.
Neural Information Processing Systems
Oct-9-2025, 03:35:21 GMT