Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping

Apr-6-2023, 16:54:15 GMT–Neural Information Processing Systems

The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity generalize well when trained with backprop and early stopping. Experiments suggest two reasons for this: 1) Overfitting can vary significantly in different regions of the model. Excess capacity allows better fit to regions of high non-linearity, and backprop often avoids overfitting the regions of low non-linearity. 2) Big nets pass through stages similar to those learned by smaller nets.
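As an illustration of the early stopping the abstract refers to, here is a minimal sketch in Python. It is not the authors' procedure: the function name, the `step` and `validate` callables, and the patience-based stopping rule are all assumptions made for this example. The idea is only that training halts once the loss on a held-out validation set stops improving.

```python
def train_with_early_stopping(step, validate, max_epochs=1000, patience=10):
    """Train until the validation loss has not improved for `patience` epochs.

    Both callables are hypothetical and supplied by the caller:
      step()     -- runs one epoch of backprop weight updates
      validate() -- returns the current loss on a held-out validation set
    Returns (best_epoch, best_loss), with epochs counted from 0.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch in range(max_epochs):
        step()
        loss = validate()
        if loss < best_loss:
            # New best validation loss: remember when it happened.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training.
            break
    return best_epoch, best_loss


if __name__ == "__main__":
    # Simulated validation curve that improves, then degrades (overfitting).
    curve = iter([5.0, 4.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    print(train_with_early_stopping(lambda: None, lambda: next(curve),
                                    max_epochs=8, patience=2))
```

This sketch only reports the best epoch. In practice one would typically also save the weights at that epoch and restore them after stopping, so that the large net is used at the point where it generalized best.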

backpropagation, conjugate gradient, overfitting
