Gradient Sparsification for Communication-Efficient Distributed Optimization

Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang

Neural Information Processing Systems 

In the synchronous stochastic gradient method, each worker processes a random minibatch of its training data, and then the local updates are synchronized by an All-Reduce step, which aggregates stochastic gradients from all workers, and a Broadcast step, which transmits the updated parameter vector back to all workers.
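The All-Reduce/Broadcast pattern above can be sketched with simulated workers; this is a minimal illustration, not the paper's implementation. The function names (`local_gradient`, `sync_sgd_step`) and the least-squares objective are assumptions chosen for concreteness.

```python
import numpy as np

def local_gradient(w, X, y):
    # Least-squares gradient on one worker's minibatch: X^T (Xw - y) / n
    n = len(y)
    return X.T @ (X @ w - y) / n

def sync_sgd_step(w, worker_batches, lr=0.1):
    # Each worker computes a stochastic gradient on its own minibatch.
    grads = [local_gradient(w, X, y) for X, y in worker_batches]
    # All-Reduce: aggregate (here, average) the gradients from all workers.
    g = np.mean(grads, axis=0)
    # Update, then Broadcast: every worker receives the same new parameters.
    return w - lr * g

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
w = np.zeros(2)
batches = []
for _ in range(4):  # four simulated workers, each with its own minibatch
    X = rng.normal(size=(32, 2))
    y = X @ w_true
    batches.append((X, y))
for _ in range(200):
    w = sync_sgd_step(w, batches)
print(np.round(w, 2))
```

In a real distributed setting the averaging step is carried out by a collective communication primitive (e.g. MPI's `Allreduce`), and its cost scales with the gradient dimension, which is what gradient sparsification aims to reduce.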
