Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization

Wang, Xiao, Ma, Shiqian, Goldfarb, Donald, Liu, Wei

May-21-2017–arXiv.org Machine Learning

In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that noisy information about the gradients of the objective function is available via a stochastic first-order oracle (SFO). We propose a general framework for such methods, for which we prove almost sure convergence to stationary points and analyze its worst-case iteration complexity. When a randomly chosen iterate is returned as the output of such an algorithm, we prove that in the worst-case, the SFO-calls complexity is $O(\epsilon^{-2})$ to ensure that the expectation of the squared norm of the gradient is smaller than the given accuracy tolerance $\epsilon$. We also propose a specific algorithm, namely a stochastic damped L-BFGS (SdLBFGS) method, that falls under the proposed framework. {Moreover, we incorporate the SVRG variance reduction technique into the proposed SdLBFGS method, and analyze its SFO-calls complexity. Numerical results on a nonconvex binary classification problem using SVM, and a multiclass classification problem using neural networks are reported.

artificial intelligence, machine learning, sdlbfgs, (19 more...)

arXiv.org Machine Learning

May-21-2017

arXiv.org PDF

Add feedback

Country:
- Asia > China (0.28)
- North America > United States (0.28)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Mathematical & Statistical Methods (1.00)
  - Machine Learning > Statistical Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found