Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent
Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal
Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating the gradient in order to perform gradient descent that converges to a stationary point for general nonconvex optimization problems. Beyond first-order stationarity, second-order stationary properties are important in machine learning applications for achieving better performance. Gradient descent and its variants (e.g., Stochastic Gradient Descent) are widely used in machine learning due to their favorable computational properties, for example, in optimizing the weights of a deep neural network. Recently, second-order stationary guarantees have been studied using a perturbed version of gradient descent [2].
Oct-2-2019
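To make the idea concrete, below is a minimal Python sketch of the general approach described in the abstract: a two-point zeroth-order gradient estimate built from function evaluations only, plugged into plain gradient descent. The smoothing radius mu, the number of sampled directions, the step size, and the test function are illustrative assumptions, not the paper's exact algorithm or parameters.

import numpy as np

def estimated_gradient(f, x, mu=1e-4, num_dirs=None):
    """Two-point zeroth-order estimate of the gradient of f at x.

    Averages directional finite differences along random Gaussian
    directions; only function evaluations are used, no gradient oracle.
    """
    d = x.shape[0]
    num_dirs = num_dirs or d
    grad = np.zeros(d)
    for _ in range(num_dirs):
        u = np.random.randn(d)  # random search direction
        # finite-difference estimate of the directional derivative along u
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / num_dirs

def zeroth_order_gd(f, x0, lr=0.1, steps=200):
    """Gradient descent driven entirely by the estimated gradient."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * estimated_gradient(f, x)
    return x

# Example: minimize a simple nonconvex test function with two minima
# per coordinate at +/- 1/sqrt(2) and a saddle at the origin.
f = lambda x: np.sum(x**4) - np.sum(x**2)
x_min = zeroth_order_gd(f, np.array([1.5, -2.0]))
print(x_min)  # each coordinate should land near +/- 0.707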