Adaptive Top-K in SGD for Communication-Efficient Distributed Learning
