Review for NeurIPS paper: Practical Low-Rank Communication Compression in Decentralized Deep Learning

Summary and Contributions: Post-rebuttal update: I am happy with the authors' response to my question on the bounded variance assumption. I maintain that the paper should be accepted. The authors take inspiration from a power-method-based compression method for efficient communication in distributed optimization. They apply this idea instead to the 'decentralized' setting, where communication is limited to neighboring nodes on some network topology. A long-known property of the power method is that it requires little hyperparameter tuning.
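To make the core primitive concrete (my own illustrative sketch, not the authors' exact algorithm): the idea is to compress the difference between two neighboring workers' parameter matrices to rank one using a single power-iteration step, warm-started from the previous round's vector so that no tuning is needed. The function and variable names below are hypothetical.

```python
import numpy as np

def rank1_power_compress(diff, v_prev):
    """One warm-started power-iteration step giving a rank-1
    approximation of `diff` (illustrative sketch only).

    diff:   (m, n) parameter difference between two neighbors
    v_prev: (n,)   right vector carried over from the previous round
    """
    p = diff @ v_prev                 # left vector (one message to the neighbor)
    p_norm = np.linalg.norm(p)
    if p_norm > 0:
        p /= p_norm                   # normalize for numerical stability
    q = diff.T @ p                    # right vector (second message)
    return p, q                       # the neighbor reconstructs np.outer(p, q)

# Toy usage: compress the difference between two workers' weight matrices.
rng = np.random.default_rng(0)
x_a = rng.standard_normal((64, 32))
x_b = rng.standard_normal((64, 32))
v = rng.standard_normal(32)           # warm start; reused across rounds
p, q = rank1_power_compress(x_a - x_b, v)
approx = np.outer(p, q)               # rank-1 stand-in for the full difference
```

Only the two vectors p and q cross the network instead of the full m-by-n difference, which is where the communication savings come from; repeating the step across rounds with warm starts recovers the dominant directions without any compression hyperparameters.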