Reviews: TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
Neural Information Processing Systems
The authors address the communication cost of distributed deep learning. They propose TernGrad, a method that quantizes gradients to ternary values to reduce the overhead of gradient synchronization. The technique is sound, and the authors provide experiments to evaluate it. Overall, the method is novel and interesting and may have a significant impact on large-scale deep learning. Some comments follow for the authors to improve their paper.
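To make the reviewed idea concrete, here is a minimal sketch of ternary gradient quantization in NumPy. It assumes a TernGrad-style scheme: scale each tensor by its maximum absolute value and stochastically keep each component's sign so that the quantizer is unbiased in expectation; the function name `ternarize` and the per-tensor scaling are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def ternarize(grad, rng=None):
    """Stochastically quantize a gradient tensor to {-s, 0, +s}.

    Sketch of a TernGrad-style quantizer (illustrative, not the
    paper's code): s is the max absolute value of the tensor, and
    each component keeps its sign with probability |g_i| / s, so
    E[q] = g (unbiased in expectation).
    """
    rng = rng or np.random.default_rng()
    s = np.max(np.abs(grad))
    if s == 0:
        return np.zeros_like(grad)
    # Bernoulli mask: larger-magnitude components are kept more often
    keep = rng.random(grad.shape) < np.abs(grad) / s
    return s * np.sign(grad) * keep

g = np.array([0.1, -0.4, 0.4, 0.0])
q = ternarize(g)  # every entry of q lies in {-0.4, 0.0, 0.4}
```

Each worker would then transmit only the scalar `s` and a 2-bit code per component, which is the source of the claimed communication savings.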
Oct-8-2024, 03:22:18 GMT