Not enough data to create a plot.
Try a different view from the menu above.
Ahmad, Subutai
Asymptotic Convergence of Backpropagation: Numerical Experiments
Ahmad, Subutai, Tesauro, Gerald, He, Yu
We have calculated, both analytically and in simulations, the rate of convergence at long times in the backpropagation learning algorithm for networks with and without hidden units. Our basic finding for units using the standard sigmoid transfer function is lit convergence of the error for large t, with at most logarithmic corrections for networks with hidden units. Other transfer functions may lead to a 8lower polynomial rate of convergence. Our analytic calculations were presented in (Tesauro, He & Ahamd, 1989). Here we focus in more detail on our empirical measurements of the convergence rate in numerical simulations, which confirm our analytic results.
Scaling and Generalization in Neural Networks: A Case Study
Ahmad, Subutai, Tesauro, Gerald
The issues of scaling and generalization have emerged as key issues in current studies of supervised learning from examples in neural networks. Questions such as how many training patterns and training cycles are needed for a problem of a given size and difficulty, how to represent the inllUh and how to choose useful training exemplars, are of considerable theoretical and practical importance. Several intuitive rules of thumb have been obtained from empirical studies, but as yet there are few rigorous results.In this paper we summarize a study Qf generalization in the simplest possible case-perceptron networks learning linearly separable functions.The task chosen was the majority function (i.e. return a 1 if a majority of the input units are on), a predicate with a number ofuseful properties. We find that many aspects of.generalization in multilayer networks learning large, difficult tasks are reproduced in this simple domain, in which concrete numerical results and even some analytic understanding can be achieved.
Scaling and Generalization in Neural Networks: A Case Study
Ahmad, Subutai, Tesauro, Gerald
The issues of scaling and generalization have emerged as key issues in current studies of supervised learning from examples in neural networks. Questions such as how many training patterns and training cycles are needed for a problem of a given size and difficulty, how to represent the inllUh and how to choose useful training exemplars, are of considerable theoretical and practical importance. Several intuitive rules of thumb have been obtained from empirical studies, but as yet there are few rigorous results. In this paper we summarize a study Qf generalization in the simplest possible case-perceptron networks learning linearly separable functions. The task chosen was the majority function (i.e. return a 1 if a majority of the input units are on), a predicate with a number of useful properties. We find that many aspects of.generalization in multilayer networks learning large, difficult tasks are reproduced in this simple domain, in which concrete numerical results and even some analytic understanding can be achieved.