Temporal Dynamics of Generalization in Neural Networks

Dec-31-1995–Neural Information Processing Systems

This paper presents a rigorous characterization of how a general nonlinear learning machine generalizes during the training process when it is trained on a random sample using a gradient descent algorithm based on reduction of training error. It is shown, in particular, that best generalization performance occurs, in general, before the global minimum of the training error is achieved. The different roles played by the complexity of the machine class and the complexity of the specific machine in the class during learning are also precisely demarcated.

1 INTRODUCTION

In learning machines such as neural networks, two major factors that affect the 'goodness of fit' of the examples are network size (complexity) and training time. These are also the major factors that affect the generalization performance of the network. Many theoretical studies exploring the relation between generalization performance and machine complexity support the parsimony heuristics suggested by Occam's razor, to wit, that amongst machines with similar training performance one should opt for the machine of least complexity.
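The claim that the best generalization can occur before the training error reaches its minimum can be probed on a toy problem. The following is a minimal sketch, not the paper's analysis: it assumes a degree-9 polynomial model, a noisy sine target, and a held-out sample standing in for generalization error. All names (`make_data`, `train_with_tracking`, `best_step`) and all constants are hypothetical choices made for illustration.

```python
import math
import random


def features(x, degree):
    # Polynomial features 1, x, x^2, ..., x^degree (illustrative model choice).
    return [x ** j for j in range(degree + 1)]


def predict(weights, x, degree):
    return sum(w * f for w, f in zip(weights, features(x, degree)))


def mse(weights, xs, ys, degree):
    # Mean squared error of the model on the sample (xs, ys).
    return sum((predict(weights, x, degree) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def make_data(rng, n, noise=0.3):
    # Assumed target: sin(pi * x) plus Gaussian noise, with x in [-1, 1].
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [math.sin(math.pi * x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def train_with_tracking(train, val, degree=9, lr=0.1, steps=2000):
    # Plain gradient descent on half the training MSE, recording the
    # training and held-out error before every update.
    xs, ys = train
    val_xs, val_ys = val
    weights = [0.0] * (degree + 1)
    train_curve, val_curve = [], []
    for _ in range(steps):
        train_curve.append(mse(weights, xs, ys, degree))
        val_curve.append(mse(weights, val_xs, val_ys, degree))
        grad = [0.0] * (degree + 1)
        for x, y in zip(xs, ys):
            phi = features(x, degree)
            err = sum(w * f for w, f in zip(weights, phi)) - y
            for j, f in enumerate(phi):
                grad[j] += err * f / len(xs)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    # Step at which the held-out error was lowest (first such step on ties).
    best_step = min(range(steps), key=val_curve.__getitem__)
    return train_curve, val_curve, best_step


rng = random.Random(0)
train_curve, val_curve, best_step = train_with_tracking(
    make_data(rng, 15), make_data(rng, 50))
print("step with lowest held-out error:", best_step)
print("held-out error there vs. at the last step:",
      val_curve[best_step], val_curve[-1])
```

Comparing `best_step` with the final step shows whether, on this particular sample, the held-out error bottoms out while the training error is still falling. The outcome depends on the seed, noise level, and sample sizes, so it illustrates the question rather than reproducing the paper's result.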

artificial intelligence, generalization error, neural network


Country:
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.15)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
