On the Convergence of Gradient Descent Training for Two-layer ReLU-networks in the Mean Field Regime

May-27-2020–arXiv.org Machine Learning

We describe a necessary and sufficient condition for convergence to minimum Bayes risk (MBR) when training two-layer ReLU-networks by gradient descent in the mean field regime with an omni-directional initial parameter distribution. This article extends recent results of Chizat and Bach to ReLU-activated networks and to the situation in which no parameters exactly achieve MBR. The condition does not depend on the initialization of parameters and concerns only the weak convergence of the realization of the neural network, not its parameter distribution.
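As a rough illustration of the setting in the abstract, the following is a minimal sketch of gradient descent on a two-layer ReLU network with mean field normalization, f(x) = (1/m) Σ a_i ReLU(w_i·x + b_i). It is not taken from the paper: the function names (`init_params`, `predict`, `risk`, `gd_step`), the squared loss, the finite data set, and the Gaussian initialization (used here only as a stand-in for an omni-directional distribution) are all assumptions made for illustration, and discrete steps only approximate the continuous-time gradient flow.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def init_params(m, d, seed=0):
    """Draw m neurons (a, w, b) from a standard Gaussian (illustrative assumption)."""
    rng = random.Random(seed)
    return [
        (rng.gauss(0, 1), [rng.gauss(0, 1) for _ in range(d)], rng.gauss(0, 1))
        for _ in range(m)
    ]


def pre_activation(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def predict(params, x):
    """Mean field normalization: average over neurons (factor 1/m, not 1/sqrt(m))."""
    m = len(params)
    return sum(a * relu(pre_activation(w, b, x)) for a, w, b in params) / m


def risk(params, data):
    """Empirical squared-loss risk (assumed loss, for illustration only)."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / (2 * len(data))


def gd_step(params, data, lr):
    """One full-batch gradient descent step.

    The gradient of the risk with respect to one neuron carries a factor 1/m.
    The usual mean field time scaling multiplies the step size by m, so the
    two factors cancel and each neuron moves by lr times an O(1) quantity.
    """
    n = len(data)
    residuals = [predict(params, x) - y for x, y in data]
    updated = []
    for a, w, b in params:
        grad_a, grad_w, grad_b = 0.0, [0.0] * len(w), 0.0
        for (x, _), r in zip(data, residuals):
            pre = pre_activation(w, b, x)
            if pre > 0.0:  # ReLU is active; derivative taken as 0 otherwise
                grad_a += r * pre
                grad_b += r * a
                for j, xj in enumerate(x):
                    grad_w[j] += r * a * xj
        step = lr / n
        updated.append(
            (
                a - step * grad_a,
                [wj - step * gj for wj, gj in zip(w, grad_w)],
                b - step * grad_b,
            )
        )
    return updated


if __name__ == "__main__":
    # Toy regression target y = |x| on a small grid (hypothetical data).
    data = [([k / 4.0], abs(k / 4.0)) for k in range(-4, 5)]
    params = init_params(m=20, d=1)
    print("initial risk:", risk(params, data))
    for _ in range(100):
        params = gd_step(params, data, lr=0.05)
    print("risk after training:", risk(params, data))
```

The sketch tracks only the realization of the network, that is, the function `predict` and its risk, which is the object the convergence condition in the abstract concerns. It makes no claim about how the parameter distribution itself evolves.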

artificial intelligence, gradient flow, machine learning

Country:
- North America > United States
  - Indiana (0.04)
  - New Jersey > Mercer County
    - Princeton (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia
  - China (0.04)
  - Middle East
    - Jordan (0.04)
    - Israel (0.04)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Statistical Learning > Gradient Descent (0.72)
