

 Statistical Learning




An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness

Neural Information Processing Systems

However, it remains unclear whether the convergence rate can be improved further when the assumptions on the function at the population level also hold almost surely for each random realization (e.g., Lipschitzness of each realization of the stochastic gradient).








CD_GraB_camera_ready

Neural Information Processing Systems

Whereas random reshuffling (RR) arbitrarily permutes training examples, online Gradient Balancing (GraB) leverages stale gradients from prior epochs to order examples, achieving a provably faster convergence rate than RR.
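
The contrast between the two orderings can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `random_reshuffle` and `balance_reorder` are hypothetical, gradients are plain Python lists, and the greedy sign rule is a simplified stand-in for the balancing step, assuming one stale gradient per example is available from the previous epoch.

```python
import random


def random_reshuffle(n, rng):
    """RR: return a fresh uniformly random permutation of example indices."""
    order = list(range(n))
    rng.shuffle(order)
    return order


def balance_reorder(prev_order, stale_grads):
    """Reorder examples with a greedy sign-balancing rule (illustrative only).

    prev_order:  example indices in the order visited last epoch.
    stale_grads: dict mapping index -> gradient (list of floats) from last epoch.
    """
    n = len(prev_order)
    dim = len(stale_grads[prev_order[0]])
    # Center the stale gradients so that they sum to zero.
    mean = [sum(stale_grads[i][d] for i in prev_order) / n for d in range(dim)]

    running = [0.0] * dim  # signed running sum of centered gradients
    front, back = [], []
    for i in prev_order:
        centered = [g - m for g, m in zip(stale_grads[i], mean)]
        plus = sum((r + c) ** 2 for r, c in zip(running, centered))
        minus = sum((r - c) ** 2 for r, c in zip(running, centered))
        # Pick the sign that keeps the running sum smaller in norm.
        if plus <= minus:
            running = [r + c for r, c in zip(running, centered)]
            front.append(i)  # "+" examples go to the front, in visit order
        else:
            running = [r - c for r, c in zip(running, centered)]
            back.append(i)   # "-" examples go to the back, reversed below
    return front + back[::-1]


# Toy usage: four examples with one-dimensional stale gradients.
grads = {0: [1.0], 1: [1.0], 2: [-1.0], 3: [-1.0]}
new_order = balance_reorder([0, 1, 2, 3], grads)
rr_order = random_reshuffle(4, random.Random(0))
```

In this toy case the balanced order interleaves positive and negative gradients, so the partial sums of gradients along the new order stay closer to zero than along the original order. That is the intuition behind ordering by stale gradients rather than permuting arbitrarily.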