SCRN (stochastic cubic-regularized Newton) escapes saddle points and converges to local minimizers faster under the Strong Growth Condition (SGC), an interpolation-like condition satisfied in practical settings such as the training of over-parameterized deep neural networks.

Neural Information Processing Systems
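
For reference, the Strong Growth Condition mentioned above is typically stated as follows (this is the standard form used in [VBS18]; F denotes the expected objective and f(.; xi) a stochastic sample):

    \mathbb{E}_{\xi}\big[\|\nabla f(x;\xi)\|^{2}\big] \;\le\; \rho\,\|\nabla F(x)\|^{2} \quad \text{for all } x.

Under interpolation the stochastic gradients vanish wherever the full gradient does, so the condition can hold with a finite rho for over-parameterized models that fit the data exactly.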

We thank all the reviewers for their valuable comments. Prior works (e.g., [VBS18]) considered only convergence to critical points, whereas we establish convergence to local minimizers. We provide our results in both the zeroth- and higher-order settings. We also analyze the SGC assumption for unbounded functions, which was not done before in the literature. The analysis of SCRN is significantly more involved under SGC (especially in the zeroth-order setup); see also Remarks 6 and 7. Please see Lines 2-10 above. However, the method in [AZL18] is a theoretical-computer-science-style reduction approach.
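
The "zeroth-order setup" in the rebuttal refers to optimization using function evaluations only. A minimal sketch of the standard two-point Gaussian-smoothing gradient estimator used in such analyses is below; this is our illustration of the general technique, not the paper's exact construction, and mu and num_samples are hypothetical choices.

import numpy as np

def zo_gradient(f, x, mu=1e-4, num_samples=20, rng=None):
    """Two-point Gaussian-smoothing gradient estimator (Nesterov-Spokoiny style).

    Approximates grad f(x) from function values only:
        g = E_u[(f(x + mu*u) - f(x)) / mu * u],  u ~ N(0, I).
    """
    rng = np.random.default_rng(rng)
    g = np.zeros_like(x)
    fx = f(x)  # reuse the base evaluation across all sampled directions
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_samples

# Example: the estimate aligns with the true gradient of f(x) = ||x||^2 / 2.
x = np.ones(5)
g = zo_gradient(lambda z: 0.5 * z @ z, x, num_samples=2000)
print(np.round(g, 2))  # close to the true gradient, x = [1, 1, 1, 1, 1]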


Review for NeurIPS paper: Escaping Saddle-Points Faster under Interpolation-like Conditions

Neural Information Processing Systems

Weaknesses:
- The importance of the SGC condition remains unclear. In Line 129, the authors claim that the SGC condition is satisfied in some practical settings, such as the training of deep neural networks, and that it should therefore be regarded as an interesting special setting for nonconvex optimization. However, recent work [1,2] showed that the training of deep neural networks can further be regarded as a special case of convex optimization in the neural tangent kernel (NTK) regime, which is a stronger condition than SGC. The authors may therefore want to clarify the importance of SGC by giving more examples from machine learning.
- As the authors note, [VBS18] first studied the SGC condition in the nonconvex setting and showed that SGD requires $O(1/\epsilon^2)$ gradient complexity to find first-order stationary points. Meanwhile, note that [AZL18] proposed a generic framework that can turn any algorithm for finding first-order stationary points into an algorithm for finding approximate local minimizers, without hurting the convergence rate.
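
To make the SGC discussion concrete, here is a toy numerical check (our illustration, not from the paper): in over-parameterized least squares with realizable targets, the model interpolates the data, and the ratio E||grad f_i(x)||^2 / ||grad F(x)||^2 that SGC bounds stays finite.

import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                      # fewer samples than parameters
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d)      # realizable targets, so interpolation holds

def sgc_ratio(x):
    """Ratio E||grad f_i(x)||^2 / ||grad F(x)||^2 that SGC requires to be <= rho."""
    per_sample = A * (A @ x - b)[:, None]   # row i is grad of 0.5*(a_i^T x - b_i)^2
    full_grad = per_sample.mean(axis=0)
    mean_sq_norm = (per_sample ** 2).sum(axis=1).mean()
    return mean_sq_norm / (full_grad @ full_grad)

ratios = [sgc_ratio(rng.standard_normal(d)) for _ in range(1000)]
print(f"largest SGC ratio over random points: {max(ratios):.1f}")  # stays bounded

Without interpolation (e.g., noisy targets b), the same ratio blows up near minimizers, since the full gradient vanishes while per-sample gradients do not; this is why SGC is tied to over-parameterization.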