The global convergence time of stochastic gradient descent in non-convex landscapes: Sharp estimates via large deviations
