Stochastic Normalized Gradient Descent with Momentum for Large Batch Training
