Non-approximability of constructive global $\mathcal{L}^2$ minimizers by gradient descent in Deep Learning

Thomas Chen, Patricia Muñoz Ewald

arXiv.org Machine Learning 

We analyze geometric aspects of the gradient descent algorithm in Deep Learning (DL) networks. In particular, we prove that the globally minimizing weights and biases for the $\mathcal{L}^2$ cost, obtained constructively in [Chen-Munoz Ewald 2023] for underparametrized ReLU DL networks, generically cannot be approximated via the gradient descent flow. We therefore conclude that the method introduced in [Chen-Munoz Ewald 2023] is disjoint from the gradient descent method.
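
For orientation, a minimal sketch of the objects named above, with notation assumed here rather than taken from the paper: writing $\theta$ for the collection of weights and biases, $f_\theta$ for the ReLU network map, and $(x^{(j)}, y^{(j)})_{j=1}^{N}$ for the training data, the $\mathcal{L}^2$ cost and the associated gradient descent flow read
\[
\mathcal{C}[\theta] = \frac{1}{2N} \sum_{j=1}^{N} \big| f_\theta(x^{(j)}) - y^{(j)} \big|^2, \qquad \frac{d\theta(t)}{dt} = -\nabla_\theta\, \mathcal{C}[\theta(t)].
\]
In these terms, the claim is that the orbits $\theta(t)$ of this flow generically fail to converge to the global minimizers of $\mathcal{C}$ constructed in [Chen-Munoz Ewald 2023].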
