Double Quantization for Communication-Efficient Distributed Optimization

Yue Yu, Jiaxiang Wu, Longbo Huang

Neural Information Processing Systems 

Modern distributed training of machine learning models often suffers from high communication overhead incurred by synchronizing stochastic gradients and model parameters across workers.