Global Convergence and Stability of Stochastic Gradient Descent

Neural Information Processing Systems 

In this work, we demonstrate the restrictiveness of these assumptions using three canonical models in machine learning.