Mean-Field Analysis of Two-Layer Neural Networks: Non-Asymptotic Rates and Generalization Bounds

Chen, Zixiang, Cao, Yuan, Gu, Quanquan, Zhang, Tong

arXiv.org Machine Learning 

Deep learning has achieved tremendous practical success in a wide range of machine learning tasks (Krizhevsky et al., 2012; Hinton et al., 2012; Silver et al., 2016). However, due to the nonconvex and over-parameterized nature of modern neural networks, the success of deep learning cannot be fully explained by conventional optimization and machine learning theory. Recently, a line of work has used a mean-field framework to study the training of extremely wide (or even infinitely wide) neural networks (Chizat and Bach, 2018; Mei et al., 2018, 2019; Wei et al., 2019; Fang et al., 2019a,b). It has been shown that over-parameterized two-layer neural networks can be trained to a global optimizer of the training loss, despite the nonconvex optimization landscape. However, most of the global convergence results proved in this line of work are asymptotic, and the convergence rate of the training algorithm is largely unknown, except for some specifically designed training procedures (Wei et al., 2019). Moreover, the generalization performance of neural networks trained in the mean-field regime has not been well studied. Compared with the mean-field analysis, another line of work studying the learning of over-parameterized neural networks in the so-called "neural tangent kernel (NTK) regime" (Jacot et al.,
