MARL in real scenarios is still challenging due to the same safety and efficiency concerns in single-agent setting, then it is worth conducting investigation for offline RL in multi-agent setting.
We present a novel multi-agent RL approach, Selective Multi-Agent Prioritized Experience Relay, in which agents share with other agents a limited number of transitions they observe during training.