Review for NeurIPS paper: On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems

Neural Information Processing Systems 

Weaknesses: There are many similar results in slightly different regimes, which makes this work look incremental. In the case of GD, the Morse assumption can be avoided by using a stronger stable manifold theorem in "Michael Shub. I suspect a similar combination might go through here. Asymptotic results (this paper) are usually viewed as weaker than non-asymptotic results (earlier papers). It is also not clear from this paper whether one can obtain a probability-1 result by modifying the existing high-probability results with the Borel-Cantelli lemma and a bit of extra work.
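
The Borel-Cantelli route mentioned above could be sketched roughly as follows. This is only an illustration under assumptions not taken from the paper: the symbols epsilon_k, delta_k, T_k and A_k are hypothetical notation, and I assume the existing high-probability bound can be applied to a single run with summable failure probabilities, which may fail when the step size depends on the target accuracy.

```latex
% Assumed form of an existing high-probability guarantee (hypothetical notation):
% for accuracy \epsilon_k and failure probability \delta_k there is a horizon T_k with
\[
  \mathbb{P}(A_k) \le \delta_k,
  \qquad
  A_k = \bigl\{\text{no iterate } x_t,\ t \le T_k,\ \text{is an }
        \epsilon_k\text{-second-order stationary point}\bigr\}.
\]
% Choose \epsilon_k \to 0 and \delta_k = 2^{-k}, so the failure probabilities are summable:
\[
  \sum_{k \ge 1} \mathbb{P}(A_k) \le \sum_{k \ge 1} 2^{-k} = 1 < \infty .
\]
% First Borel-Cantelli lemma: almost surely only finitely many A_k occur, i.e.
\[
  \mathbb{P}\Bigl(\limsup_{k \to \infty} A_k\Bigr) = 0 .
\]
```

This would only show that, with probability 1, the iterates visit an epsilon_k-second-order stationary point for all sufficiently large k. Turning that into convergence of the sequence itself is the extra work referred to above.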