Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis

Jun-4-2021, arXiv.org Machine Learning

The representation of functions by artificial neural networks depends on a large number of parameters in a non-linear fashion. Suitable parameters are found by minimizing a 'loss functional', typically by stochastic gradient descent (SGD) or an advanced SGD-based algorithm. In a continuous-time model for SGD with noise that follows the 'machine learning scaling', we show that in a certain noise regime, the optimization algorithm prefers 'flat' minima of the objective function in a sense that differs from the flat minimum selection of continuous-time SGD with homogeneous noise.
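
The contrast between the two noise models can be made concrete with a toy simulation. The sketch below is illustrative only: the abstract does not spell out the model, so it assumes that 'machine learning scaling' means noise whose variance is proportional to the current value of the objective, and that 'homogeneous' means noise of constant variance. The quadratic objective, the Euler-Maruyama discretization, and all names and parameters (`simulate`, `eta`, `sigma`, `dt`) are hypothetical choices made for this example, not taken from the paper.

```python
import math
import random


def f(x):
    # Toy objective (assumption): global minimum at x = 0 with f(0) = 0.
    return 0.5 * x * x


def grad_f(x):
    return x


def simulate(x0, noise, eta=0.1, sigma=1.0, dt=1e-3, steps=5000, seed=0):
    """Euler-Maruyama sketch of dX = -grad f(X) dt + amp(X) dW.

    noise="ml":          amp(x) = sqrt(eta * sigma * f(x))  (assumed loss-scaled noise)
    noise="homogeneous": amp(x) = sqrt(eta * sigma)         (constant noise)
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        if noise == "ml":
            amp = math.sqrt(eta * sigma * f(x))
        else:
            amp = math.sqrt(eta * sigma)
        x = x - grad_f(x) * dt + amp * dw
    return x


if __name__ == "__main__":
    for kind in ("ml", "homogeneous"):
        print(kind, simulate(1.0, kind), simulate(0.0, kind))
```

Under these assumptions, a trajectory started exactly at the global minimum never moves in the loss-scaled case, because both the drift and the noise amplitude vanish there, while homogeneous noise keeps perturbing it. This is only meant to show why the two models can select minima differently; it does not reproduce the paper's analysis.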

neural network, noise, upstream oil & gas

Country:
- North America > United States (0.45)

Genre:
- Research Report (0.81)

Industry:
- Energy > Oil & Gas > Upstream (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Gradient Descent (1.00)
  - Neural Networks (1.00)
