Review for NeurIPS paper: Succinct and Robust Multi-Agent Communication With Temporal Message Control

Neural Information Processing Systems 

Weaknesses: I do not understand the purpose of halting the training process. Without convergence, how can the real benefit of the proposed method be assessed? The two regularizers serve mainly to reduce communication and are not directly correlated with the RL objective. So why does TMC outperform QMIX and AC (as both use full communication in training)? This is not clear, and is even counter-intuitive.