Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction

Neural Information Processing Systems

The recently proposed stochastic Polyak stepsize (SPS) and stochastic line-search (SLS) for SGD have shown remarkable effectiveness when training over-parameterized models. However, two issues remain unsolved in this line of work.