Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks

Wind, Johan S., Antun, Vegard, Hansen, Anders C.

Jul-13-2023, arXiv.org Artificial Intelligence

During the past decade, deep learning has transformed a number of historically challenging problems in computer vision, natural language processing, game intelligence, etc. In many of these applications, the trained neural networks are over-parameterized; that is, they have far more parameters than the number of data points used for training. In this setting, a neural network can typically fit any training data - including random labels [95] - making it hard to explain why deep learning methods generalize so well [36]. Moreover, the practical performance of neural networks often improves as the number of parameters grows [55, 84]. These observations have led to the study of the potential implicit regularization (sometimes called implicit bias) imposed by gradient-based methods and different network architectures [8, 68, 69]. It may seem surprising that there is a link to generalized hardness of approximation (GHA), as this phenomenon may, at first glance, seem disconnected from implicit regularization. However, the GHA phenomenon (see 1.2), which first appeared in [13] (see also [2], Chapter 8) and was analyzed in [13, 34, 41] in connection with robust and convex optimization [20, 21, 63, 64], typically stems from regularization problems (e.g.

Country:
- Africa > Sudan (0.04)
- North America > United States
  - New York > New York County > New York City (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.28)
  - Switzerland > Basel-City
    - Basel (0.04)
  - Norway > Eastern Norway
    - Oslo (0.04)

Genre:
- Research Report > New Finding (0.45)

Industry:
- Leisure & Entertainment > Games (0.34)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
