Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent

Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal

arXiv.org Machine Learning 

Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating the gradient to perform gradient descent that converges to a stationary point for general non-convex optimization problems. Beyond the first-order stationary properties, second-order stationary properties are important in machine learning applications to achieve better performance.

Gradient descent and its variants (e.g., Stochastic Gradient Descent) are widely used in machine learning due to their favorable computational properties, for example, in optimizing the weights of a deep neural network. Recently, second-order stationarity guarantees have been established by using a perturbed version of gradient descent [2].
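As a concrete illustration of the idea, here is a minimal sketch, not the paper's exact algorithm: the gradient is estimated from function evaluations alone via two-point finite differences, and a small random perturbation is added near first-order stationary points, in the spirit of perturbed gradient descent [2], so the iterates can escape saddles. All function names, hyperparameters, and the perturbation rule below are illustrative assumptions.

import numpy as np

def estimate_gradient(f, x, delta=1e-4):
    # Two-point finite-difference estimate of the gradient of f at x,
    # using only zeroth-order (function value) queries.
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = delta
        grad[i] = (f(x + e) - f(x - e)) / (2 * delta)
    return grad

def estimated_gradient_descent(f, x0, eta=0.1, n_steps=1000, radius=1e-3):
    # Gradient descent driven by estimated gradients. When the estimated
    # gradient is small (near a first-order stationary point), add an
    # isotropic random perturbation to help escape saddle points.
    # Step size, iteration count, and thresholds are illustrative,
    # not the schedule analyzed in the paper.
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = estimate_gradient(f, x)
        if np.linalg.norm(g) < 1e-3:
            x = x + radius * np.random.randn(*x.shape)
        x = x - eta * g
    return x

# Usage: f has a saddle point at the origin and minima at (+/-1, 0).
f = lambda z: z[0]**4 / 4 - z[0]**2 / 2 + z[1]**2 / 2
x_final = estimated_gradient_descent(f, x0=[0.0, 0.0])
# Starting exactly at the saddle, the estimated gradient is zero, so the
# perturbation triggers and the iterates escape toward one of the minima.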
