Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning Jianzhun Shao, Yun Qu
–Neural Information Processing Systems
Applying MARL in real-world scenarios remains challenging because of the same safety and efficiency concerns that arise in the single-agent setting, so offline RL in the multi-agent setting is worth investigating.
Oct-9-2025, 11:47:02 GMT