Quadratic models for understanding neural network dynamics

Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

arXiv.org, Artificial Intelligence

A recent remarkable finding about neural networks, originating from [9] and termed the "transition to linearity" [16], is that, as network width goes to infinity, such models become linear functions in the parameter space. Thus, a linear (in parameters) model can accurately approximate wide neural networks under certain conditions. While this finding has helped improve our understanding of trained neural networks [4, 20, 29, 18, 11, 3], not all properties of finite-width neural networks can be understood in terms of linear models, as shown in several recent works [27, 21, 17, 6]. In this work, we show that optimization and generalization properties of finite-width neural networks that cannot be captured by linear models are, in fact, manifested in quadratic models.
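To make "linear (in parameters)" and "quadratic" models concrete, the following is a minimal sketch, not taken from the paper: it builds the first-order (linear) and second-order (quadratic) Taylor expansions of a toy network around its initialization w0 using JAX. The network f, its size, and all parameter values are illustrative assumptions; the point is only that the quadratic model adds a Hessian term and so tracks the true network more closely away from w0.

```python
# Illustrative sketch (not the paper's code): linear vs. quadratic
# approximations of a network, Taylor-expanded in the parameters.
import jax
import jax.numpy as jnp

def f(w, x):
    """Toy one-hidden-layer network; w packs both weight matrices."""
    w1 = w[:8].reshape(4, 2)   # hidden layer: 2 inputs -> 4 units
    w2 = w[8:]                 # output layer: 4 units -> scalar
    return jnp.dot(w2, jnp.tanh(jnp.dot(w1, x)))

w0 = jax.random.normal(jax.random.PRNGKey(0), (12,)) * 0.5  # initialization
x = jnp.array([1.0, -0.5])

def linear_model(w):
    # f_lin(w) = f(w0) + <grad f(w0), w - w0>
    g = jax.grad(f)(w0, x)
    return f(w0, x) + jnp.dot(g, w - w0)

def quadratic_model(w):
    # f_quad(w) = f_lin(w) + 1/2 (w - w0)^T H(w0) (w - w0)
    d = w - w0
    H = jax.hessian(f)(w0, x)
    return linear_model(w) + 0.5 * jnp.dot(d, jnp.dot(H, d))

# Perturb the parameters; the quadratic model stays closer to the true f.
w = w0 + 0.1 * jax.random.normal(jax.random.PRNGKey(1), (12,))
print(f(w, x), linear_model(w), quadratic_model(w))
```

In the infinite-width limit described above, f itself becomes effectively linear in w, so f_lin suffices; at finite width the Hessian term is what the quadratic model retains.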
