Gradient Sparsification for Communication-Efficient Distributed Optimization

Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang

Neural Information Processing Systems 

In the synchronous stochastic gradient method, each worker computes a stochastic gradient on a random minibatch of its local training data; the local updates are then synchronized by an All-Reduce step, which aggregates the stochastic gradients from all workers, followed by a Broadcast step that transmits the updated parameter vector back to all workers.
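The compute/All-Reduce/Broadcast cycle described above can be sketched in plain Python by simulating the workers in one process; this is an illustrative assumption, not the paper's implementation (a real system would use an MPI or NCCL collective for the aggregation), and the scalar least-squares model and learning rate are made up for the example.

```python
import random

def synchronous_sgd_step(w, worker_batches, lr=0.1):
    """One synchronous SGD step for a scalar least-squares model y = w*x.

    The averaging below plays the role of All-Reduce; returning the
    same updated w to every (simulated) worker plays the role of
    Broadcast.
    """
    # Each worker computes a stochastic gradient on its own minibatch:
    # grad = mean((w*x - y) * x) over the batch.
    local_grads = []
    for batch in worker_batches:
        g = sum((w * x - y) * x for x, y in batch) / len(batch)
        local_grads.append(g)
    # All-Reduce: aggregate (here, average) gradients from all workers.
    avg_grad = sum(local_grads) / len(local_grads)
    # Update, then Broadcast: every worker receives the same new w.
    return w - lr * avg_grad

# Usage: 4 simulated workers, each holding 32 samples of y = 2*x.
random.seed(0)
workers = []
for _ in range(4):
    batch = []
    for _ in range(32):
        x = random.gauss(0, 1)
        batch.append((x, 2.0 * x))
    workers.append(batch)

w = 0.0
for _ in range(200):
    w = synchronous_sgd_step(w, workers)
```

After 200 steps w converges close to the true slope 2.0; the communication cost per step is what gradient sparsification aims to reduce, by transmitting only a sparse subset of gradient coordinates.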
