A Stochastic Quasi-Newton Method with Nesterov's Accelerated Gradient

Indrapriyadarsini, S., Mahboubi, Shahrzad, Ninomiya, Hiroshi, Asai, Hideki

Sep-8-2019–arXiv.org Machine Learning

Incorporating second-order curvature information into gradient-based methods has been shown to improve convergence drastically, despite its computational cost. In this paper, we propose a stochastic (online) quasi-Newton method with Nesterov's accelerated gradient, in both its full and limited-memory forms, for solving large-scale non-convex optimization problems in neural networks. The performance of the proposed algorithm is evaluated in TensorFlow on benchmark classification and regression problems. The results show improved performance compared to the classical second-order oBFGS and oLBFGS methods and popular first-order stochastic methods such as SGD and Adam. The performance with different momentum rates and batch sizes is also illustrated.

Keywords: Neural networks · stochastic method · online training · Nesterov's accelerated gradient · quasi-Newton method · limited memory · TensorFlow

1 Introduction

Neural networks have been shown to be effective in numerous real-world applications.
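As a rough illustration of the idea summarized in the abstract, the sketch below combines a Nesterov-style lookahead gradient with a standard BFGS update of the inverse-Hessian approximation. It is not the authors' algorithm: it is deterministic (no mini-batches), uses a fixed step size on a toy two-dimensional quadratic, and every name in it (`naq_sketch`, `bfgs_update`, `mu`, `alpha`, the matrix `A`) is a hypothetical choice made for this example.

```python
# Illustrative sketch only: full-batch, fixed step size, toy quadratic.
# All names and parameter values are assumptions, not taken from the paper.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bfgs_update(H, s, y):
    """Standard BFGS update of the inverse-Hessian approximation H."""
    sy = dot(s, y)
    if sy <= 1e-12:  # skip the update when the curvature condition fails
        return H
    n = len(s)
    rho = 1.0 / sy
    Hy = matvec(H, y)
    yHy = dot(y, Hy)
    # H+ = (I - rho s y^T) H (I - rho y s^T) + rho s s^T, expanded term by term
    return [[H[i][j]
             - rho * (s[i] * Hy[j] + Hy[i] * s[j])
             + (rho * rho * yHy + rho) * s[i] * s[j]
             for j in range(n)] for i in range(n)]

def naq_sketch(grad, w, mu=0.3, alpha=0.4, iters=500):
    """Quasi-Newton steps taken from a Nesterov lookahead point w + mu*v."""
    n = len(w)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    v = [0.0] * n
    for _ in range(iters):
        look = [wi + mu * vi for wi, vi in zip(w, v)]    # lookahead point
        g_look = grad(look)                              # gradient at lookahead
        d = [-x for x in matvec(H, g_look)]              # quasi-Newton direction
        v = [mu * vi + alpha * di for vi, di in zip(v, d)]
        w_new = [wi + vi for wi, vi in zip(w, v)]
        # curvature pair measured between the lookahead point and the new iterate
        s = [a - b for a, b in zip(w_new, look)]
        y = [a - b for a, b in zip(grad(w_new), g_look)]
        H = bfgs_update(H, s, y)
        w = w_new
    return w

# Toy objective f(w) = 0.5 * w^T A w - b^T w, minimized where A w = b.
A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]

def grad_f(w):
    return [aw - bi for aw, bi in zip(matvec(A, w), b)]

if __name__ == "__main__":
    print(naq_sketch(grad_f, [0.0, 0.0]))  # approaches the minimizer [0.2, 0.4]
```

The momentum rate and step size here are deliberately conservative so that the fixed-step iteration stays stable on this toy problem. A stochastic variant would evaluate both gradients on the same mini-batch, and a limited-memory variant would replace the dense matrix `H` with a short history of `(s, y)` pairs; neither is shown.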

Country:
- North America > Canada
  - Ontario > Toronto (0.14)
- Asia
  - Middle East > Jordan (0.04)
  - Japan > Honshū
    - Kantō > Kanagawa Prefecture (0.04)
    - Chūbu > Shizuoka Prefecture
      - Shizuoka (0.04)

Genre:
- Research Report (0.70)

Industry:
- Education
  - Educational Setting > Online (0.86)
  - Educational Technology > Educational Software
    - Computer Based Training (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Neural Networks (1.00)
    - Statistical Learning > Gradient Descent (0.49)
