Structured Stochastic Quasi-Newton Methods for Large-Scale Optimization Problems

Yang, Minghan; Xu, Dong; Li, Yongfeng; Wen, Zaiwen; Chen, Mengyun

Jun-16-2020–arXiv.org Machine Learning

In this paper, we consider large-scale finite-sum nonconvex problems arising from machine learning. Since the Hessian is often the sum of a relatively cheap and accessible part and an expensive or even inaccessible part, a stochastic quasi-Newton matrix is constructed using as much partial Hessian information as possible. By further exploiting low-rank structures based on the Nyström approximation, the computation of the quasi-Newton direction becomes affordable. To make full use of the gradient estimation, we also develop an extra-step strategy for this framework. Global convergence to a stationary point in expectation and a local superlinear convergence rate are established under some mild assumptions. Numerical experiments on logistic regression, deep autoencoder networks and deep learning problems show that the efficiency of our proposed method is at least comparable to that of state-of-the-art methods.
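
To illustrate why a Nyström-type low-rank structure keeps a Newton-like direction affordable, here is a minimal NumPy sketch. It is not the paper's algorithm: the function names `nystrom_factor` and `damped_direction`, the Hessian-vector callback `hess_vec`, and the damping parameter `lam` are all hypothetical and chosen for illustration only. The sketch assumes a symmetric positive semidefinite matrix H reachable only through matrix-vector products, builds the approximation H ≈ Y M^{-1} Y^T with Y = HΩ and M = Ω^T H Ω for a random n x k test matrix Ω, and then solves the damped system with the Woodbury identity so that only a k x k linear system is factorized. It further assumes M is invertible, which holds almost surely when k does not exceed the rank of H.

```python
import numpy as np


def nystrom_factor(hess_vec, n, k, rng):
    """Return Y (n x k) and M (k x k) such that H ~= Y @ inv(M) @ Y.T.

    hess_vec is a hypothetical callback computing H @ v for a symmetric
    positive semidefinite H; only k such products are used.
    """
    omega = rng.standard_normal((n, k))  # random test matrix
    Y = np.column_stack([hess_vec(omega[:, j]) for j in range(k)])
    M = omega.T @ Y                      # equals omega.T @ H @ omega
    M = 0.5 * (M + M.T)                  # symmetrize against round-off
    return Y, M


def damped_direction(Y, M, grad, lam):
    """Solve (Y inv(M) Y^T + lam * I) d = -grad via the Woodbury identity.

    Assumes lam > 0 and M invertible. Only a k x k system is solved:
    inv(lam*I + Y inv(M) Y^T) = (I - Y inv(lam*M + Y^T Y) Y^T) / lam.
    """
    small = lam * M + Y.T @ Y            # k x k matrix
    correction = Y @ np.linalg.solve(small, Y.T @ grad)
    return -(grad - correction) / lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, r = 200, 10
    B = rng.standard_normal((n, r))
    H = B @ B.T                          # synthetic PSD matrix of rank r
    Y, M = nystrom_factor(lambda v: H @ v, n, r, rng)
    g = rng.standard_normal(n)
    d = damped_direction(Y, M, g, lam=0.1)
    d_ref = np.linalg.solve(H + 0.1 * np.eye(n), -g)
    print("max abs difference from dense solve:", np.max(np.abs(d - d_ref)))
```

In this toy setting the sketch size equals the exact rank of H, so the Nyström approximation reproduces H up to round-off and the direction agrees with a dense solve. For a general Hessian the approximation is only approximate, and the cost per direction is dominated by k Hessian-vector products plus a k x k solve rather than an n x n factorization.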

artificial intelligence, machine learning, optimization problem

Country:
- North America > United States
  - New York (0.04)
- Asia > China
  - Beijing > Beijing (0.04)

Genre:
- Research Report > New Finding (0.66)

Industry:
- Education (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (0.67)
