Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
Lim, Dong-Young, Sabanis, Sotirios
Artificial neural networks (ANNs) are successfully trained when they are finely tuned via the optimization of their associated loss functions. Two aspects of such optimization tasks pose significant challenges, namely the non-convex nature of the loss functions and the highly nonlinear features of many types of ANNs. Moreover, the analysis in Lovas et al. [2020] shows that the gradients of such non-convex loss functions typically grow faster than linearly and are only locally Lipschitz continuous. Naturally, stability issues are observed, known as the 'exploding gradient' phenomenon (Bengio et al. [1994] and Pascanu et al. [2013]), when vanilla stochastic gradient descent (SGD) or certain types of adaptive algorithms are used for fine-tuning. Section 2 provides a simple but transparent example of why this phenomenon is observed, even when some of the most popular adaptive algorithms are employed. One further notes that occurrences of vanishing gradients are often reported in the ANN literature (Zhang et al. [2018] and Pascanu et al. [2013]). This phenomenon seems to particularly affect the performance of TUSLA (Lovas et al. [2020]) in our experiments when compared with other popular algorithms such as AdaGrad (Duchi et al. [2011]), RMSProp (Tieleman and Hinton [2012]), ADAM (Kingma and Ba [2015]) and AMSGrad (Reddi et al. [2018]). This is observed despite TUSLA's stability properties, which successfully control any potential 'exploding gradient' occurrences.
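As a rough illustration of the instability described above (not the paper's experiments), the minimal Python sketch below contrasts a plain gradient/Langevin step with a tamed step on the one-dimensional loss f(θ) = θ⁴/4, whose gradient θ³ grows superlinearly and is only locally Lipschitz. The starting point, step size, inverse temperature and taming factor are illustrative assumptions, and the tamed update is a simplified stand-in for the TUSLA recursion of Lovas et al. [2020], not the exact algorithm.

```python
import numpy as np

# Toy loss with a superlinearly growing gradient: f(theta) = theta**4 / 4,
# so grad f(theta) = theta**3 is only locally Lipschitz continuous.
def grad(theta):
    return theta ** 3

def run(tamed, theta0=5.0, lam=0.1, beta=1e8, n_steps=50, seed=0):
    """One-dimensional Langevin-type recursion, with or without taming (illustrative)."""
    rng = np.random.default_rng(seed)
    theta = theta0
    for _ in range(n_steps):
        g = grad(theta)
        if tamed:
            # Taming in the spirit of TUSLA: the drift is divided by a factor that
            # grows with |theta|, so each step remains bounded (simplified stand-in).
            g = g / (1.0 + np.sqrt(lam) * theta ** 2)
        theta = theta - lam * g + np.sqrt(2.0 * lam / beta) * rng.standard_normal()
        if abs(theta) > 1e100:
            return theta  # the iterates have already blown up
    return theta

print("vanilla step:", run(tamed=False))  # explodes: lam * |grad| overshoots badly
print("tamed step:  ", run(tamed=True))   # stays bounded, drifts toward the minimiser at 0
```

With these (assumed) settings the untamed iterates overshoot and diverge within a handful of steps, whereas the tamed iterates remain bounded, which is the stability behaviour the abstract attributes to TUSLA.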
May-28-2021