Understanding Top-k Sparsification in Distributed Deep Learning
