Review for NeurIPS paper: Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning


Additional Feedback: I appreciate that the authors examined their method from several perspectives, but experience sharing does appear occasionally in the existing literature. For example, although it is not mentioned in the paper, [1] used experience sharing among agents in their implementation, and I believe there are other works on MARL for homogeneous agents. The main reason I score this below the acceptance threshold is that the baselines appear to be quite weak:
- In Table 1, QMIX and MADDPG substantially underperform SEAC and the other baselines (IAC, SNAC). Since CTDE methods are generally more stable than independent learning methods, this result should be explained in more detail.
Although other reviewers have argued for the strength of this work based on its importance weighting and the simplicity of the method, I still think stronger baselines should have been included.
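
For context, my understanding of the importance-weighted policy loss being discussed is roughly of the following form (my own notation, not copied from the paper; $k$ indexes the other agents and $\lambda$ is a weighting coefficient on the shared-experience term):

$$\mathcal{L}(\phi_i) \approx -\log \pi(a_i^t \mid o_i^t; \phi_i)\, A_i^t \;-\; \lambda \sum_{k \neq i} \frac{\pi(a_k^t \mid o_k^t; \phi_i)}{\pi(a_k^t \mid o_k^t; \phi_k)} \log \pi(a_k^t \mid o_k^t; \phi_i)\, A_k^t,$$

where $A_j^t = r_j^t + \gamma V(o_j^{t+1}; \theta_i) - V(o_j^t; \theta_i)$ is the advantage computed under agent $i$'s critic on agent $j$'s transition. The off-policy correction is simple, which is part of the appeal other reviewers cite, but simplicity alone does not remove the need for stronger baselines.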