Quadratic models for understanding neural network dynamics

Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

arXiv.org, Artificial Intelligence

A recent remarkable finding about neural networks, originating from [9] and termed the "transition to linearity" [16], is that, as network width goes to infinity, such models become linear functions in the parameter space. Thus, a linear (in parameters) model can accurately approximate wide neural networks under certain conditions. While this finding has helped improve our understanding of trained neural networks [4, 20, 29, 18, 11, 3], not all properties of finite-width neural networks can be understood in terms of linear models, as shown in several recent works [27, 21, 17, 6]. In this work, we show that optimization and generalization properties of finite-width neural networks that cannot be captured by linear models are, in fact, manifested in quadratic models.
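To make "linear (in parameters)" and "quadratic" models concrete, the following is a minimal sketch, not taken from the paper: it builds the first-order (linear) and second-order (quadratic) Taylor expansions of a toy network around its initialization w0 using JAX. The network f, its size, and all parameter values are illustrative assumptions; the point is only that the quadratic model adds a Hessian term and so tracks the true network more closely away from w0.

```python
# Illustrative sketch (not the paper's code): linear vs. quadratic
# approximations of a network, Taylor-expanded in the parameters.
import jax
import jax.numpy as jnp

def f(w, x):
    """Toy one-hidden-layer network; w packs both weight matrices."""
    w1 = w[:8].reshape(4, 2)   # hidden layer: 2 inputs -> 4 units
    w2 = w[8:]                 # output layer: 4 units -> scalar
    return jnp.dot(w2, jnp.tanh(jnp.dot(w1, x)))

w0 = jax.random.normal(jax.random.PRNGKey(0), (12,)) * 0.5  # initialization
x = jnp.array([1.0, -0.5])

def linear_model(w):
    # f_lin(w) = f(w0) + <grad f(w0), w - w0>
    g = jax.grad(f)(w0, x)
    return f(w0, x) + jnp.dot(g, w - w0)

def quadratic_model(w):
    # f_quad(w) = f_lin(w) + 1/2 (w - w0)^T H(w0) (w - w0)
    d = w - w0
    H = jax.hessian(f)(w0, x)
    return linear_model(w) + 0.5 * jnp.dot(d, jnp.dot(H, d))

# Perturb the parameters; the quadratic model stays closer to the true f.
w = w0 + 0.1 * jax.random.normal(jax.random.PRNGKey(1), (12,))
print(f(w, x), linear_model(w), quadratic_model(w))
```

In the infinite-width limit described above, f itself becomes effectively linear in w, so f_lin suffices; at finite width the Hessian term is what the quadratic model retains.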
