Fast Mixing of Stochastic Gradient Descent with Normalization and Weight Decay
