On the Minimal Error of Empirical Risk Minimization
An increasing number of machine learning applications employ flexible, overparameterized models to fit the training data. Theoretical analysis of such 'overfitted' solutions has been a recent focus of the learning community. It is conjectured that the use of large overparameterized neural networks makes the loss landscape amenable to optimization through local search methods, such as stochastic gradient descent. It is also hypothesized that implicit regularization, arising from the choice of the optimization algorithm and the neural network architecture, mitigates the large complexity and ensures that the 'overfitted' solutions generalize. Suppose a 'simple' class H of models captures the relationship between the covariates X and the response variable Y. Inspired by the use of overparameterized models, we may take a much larger class F ⊇ H, for computational or other purposes (such as the lack of an explicit description of H), and minimize the training loss over this larger class.
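As a point of reference, a minimal sketch of the resulting empirical risk minimization problem is the following, assuming n i.i.d. training pairs (X_1, Y_1), ..., (X_n, Y_n) and, for concreteness, the squared loss (the text above does not fix a particular loss):

\[
\hat{f} \in \operatorname*{arg\,min}_{f \in F} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( f(X_i) - Y_i \bigr)^2 .
\]

The question suggested by the title is then how small the error of such a minimizer over the larger class F can be when the true relationship is captured by the smaller class H.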
Feb-23-2021