Modern Neural Networks Generalize on Small Data Sets
Matthew Olson, Abraham Wyner, Richard Berk
Neural Information Processing Systems
In this paper, we use a linear program to empirically decompose fitted neural networks into ensembles of low-bias sub-networks. We show that these sub-networks are relatively uncorrelated, which gives rise to an internal regularization process much like that of a random forest and helps explain why neural networks are surprisingly resistant to overfitting. We then demonstrate this in practice by applying large neural networks, with hundreds of parameters per training observation, to a collection of 116 real-world data sets from the UCI Machine Learning Repository. These data sets contain far fewer training examples than the image classification tasks typically studied in the deep learning literature, and they exhibit non-trivial label noise. We show that even in this setting deep neural networks achieve superior classification accuracy without overfitting.
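The decomposition idea in the abstract can be illustrated with a minimal sketch (not the authors' code, and omitting their linear program): for a one-hidden-layer network, the output is exactly a sum of per-unit contributions, so each hidden unit can be viewed as a tiny sub-network, and we can measure how correlated these sub-network outputs are across inputs. All names and sizes below are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy setup: a random one-hidden-layer ReLU network.
# The paper's LP decomposition groups units into low-bias sub-networks;
# here, for illustration only, each hidden unit is its own sub-network.
rng = np.random.default_rng(0)

X = rng.normal(size=(200, 5))   # 200 inputs, 5 features (assumed sizes)
K = 8                           # number of hidden units
W = rng.normal(size=(5, K))     # input-to-hidden weights
v = rng.normal(size=K)          # hidden-to-output weights

H = np.maximum(X @ W, 0.0)      # hidden activations, shape (200, K)
contrib = H * v                 # per-unit "sub-network" contributions

# The full network output is exactly the sum of sub-network outputs:
# f(x) = sum_k v_k * h_k(x).
f = contrib.sum(axis=1)

# Average absolute pairwise correlation between sub-network outputs;
# low values correspond to the decorrelation effect described above.
C = np.corrcoef(contrib.T)      # (K, K) correlation matrix
off_diag = C[~np.eye(K, dtype=bool)]
print("mean |pairwise correlation|:", np.abs(off_diag).mean())
```

The ensemble analogy rests on this additive structure: as in a random forest, averaging many weakly correlated low-bias predictors reduces variance even when each component alone would overfit.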
Dec-31-2018