Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning Jianzhun Shao, Y un Qu
–Neural Information Processing Systems
MARL in real scenarios is still challenging due to the same safety and efficiency concerns in single-agent setting, then it is worth conducting investigation for offline RL in multi-agent setting.
Neural Information Processing Systems
Feb-17-2026, 23:20:28 GMT