Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
