Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives
