Generalization Ability of Wide Neural Networks on $\mathbb{R}$
Lai, Jianfa, Xu, Manyun, Chen, Rui, Lin, Qian
–arXiv.org Artificial Intelligence
Deep neural networks have been successfully applied in various fields such as image analysis, natural language processing, and protein structure prediction [40, 22, 35]. Since the number of parameters in a deep neural network is often tens or hundreds of times larger than the sample size, the success of neural network methods has challenged the traditional bias-variance trade-off principle, one of the primary doctrines of classical statistical learning theory [61]. For example, many influential experiments [9, 67, 8, 48, 7] suggested that if one trains a neural network until it overfits the data, the resulting network can still generalize well. This observation, often referred to as the "benign overfitting phenomenon" [4, 53, 26, 45], has reshaped the landscape of research on neural networks. For example, some researchers have built giant neural networks that easily achieve nearly zero training error while delivering state-of-the-art performance [31, 50, 21]. Inspired by these experiments and observations, researchers have proposed various new theories to explain why overfitted neural networks generalize well on certain data [9, 43, 26, 47]. Several groups of statisticians have tried to explain the generalization ability of neural networks from the perspective of statistical decision theory, using various carefully designed nonparametric regression frameworks. For example, assuming that the regression function belongs to a carefully designed subclass of the Hölder continuous functions, [5] proved that there exists a neural network with sigmoid activation function achieving the corresponding minimax rate; [54] further established similar results for ReLU neural networks based on the approximation theory from [66]; [59] then extended these results to regression functions in Besov spaces and their variants.
Feb-12-2023
- Country:
- North America > United States
- Rhode Island > Providence County
- Providence (0.04)
- New York > New York County
- New York City (0.04)
- Europe > United Kingdom
- England
- Oxfordshire > Oxford (0.04)
- Cambridgeshire > Cambridge (0.04)
- Asia > China
- Genre:
- Research Report (0.81)
- Technology: