Local AdaAlter: Communication-Efficient Stochastic Gradient Descent with Adaptive Learning Rates
