The Generalisation Cost of RAMnets
Rohwer, Richard, Morciniec, Michal
Neural Computing Research Group, Aston University, Aston Triangle, Birmingham B4 7ET, UK.

Abstract

Given unlimited computational resources, it is best to use a criterion of minimal expected generalisation error to select a model and determine its parameters. However, it may be worthwhile to sacrifice some generalisation performance for higher learning speed. A method for quantifying sub-optimality is set out here, so that this choice can be made intelligently. Furthermore, the method is applicable to a broad class of models, including ultra-fast memory-based methods such as RAMnets. This brings the added benefit of providing, for the first time, the means to analyse the generalisation properties of such models in a Bayesian framework.

1 Introduction

In order to quantitatively predict the performance of methods such as the ultra-fast RAMnet, which are not trained by minimising a cost function, we develop a Bayesian formalism for estimating the generalisation cost of a wide class of algorithms.
We follow an approach similar to that of Zhu & Rohwer (to appear, 1996), using a Gaussian process to define a prior over the space of functions, so that the expected generalisation cost under the posterior can be determined. The optimal model is defined in terms of the restriction of this posterior to the subspace defined by the model. The optimum is easily determined for linear models over a set of basis functions. We go on to compute the generalisation cost (with an error bar) for all models of this class, which we demonstrate to include the RAMnets.
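As a rough illustration of the kind of quantification the abstract describes, the sketch below computes the expected squared-error generalisation cost of a predictor under a Gaussian-process posterior, so the optimal cost and the excess cost of a faster sub-optimal model can be compared. It is a minimal sketch under assumptions not taken from the paper: an RBF kernel, a squared-error cost, and a fixed-basis least-squares model standing in for a fast memory-based learner (not an actual RAMnet); all function names are illustrative.

```python
import numpy as np

def rbf(X1, X2, ell=1.0):
    # Squared-exponential kernel; the kernel choice here is illustrative.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

def posterior(X, y, Xs, noise=0.1):
    # Gaussian-process posterior mean and variance at test inputs Xs.
    K = rbf(X, X) + noise ** 2 * np.eye(len(X))
    Ks = rbf(Xs, X)
    mean = Ks @ np.linalg.solve(K, y)
    var = rbf(Xs, Xs).diagonal() - np.einsum(
        'ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mean, var

def expected_cost(g, mean, var):
    # Expected cost of predictor g under the posterior over functions f:
    # E[(f(x) - g(x))^2] = var(x) + (mean(x) - g(x))^2, averaged over
    # test inputs.  The posterior mean attains the minimum, mean(var).
    return np.mean(var + (mean - g) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)
Xs = rng.uniform(-3, 3, (200, 1))
m, v = posterior(X, y, Xs)

# A fast stand-in model: least squares over a few fixed RBF basis
# functions.  Its excess cost over the optimum is the mean squared
# deviation of its predictions from the posterior mean.
C = np.linspace(-3, 3, 8)[:, None]
w = np.linalg.lstsq(rbf(X, C), y, rcond=None)[0]
print("optimal cost   :", expected_cost(m, m, v))
print("fast-model cost:", expected_cost(rbf(Xs, C) @ w, m, v))
```

The gap between the two printed numbers is the quantity one would trade against training speed when deciding whether the faster model is good enough.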
The "Moving Targets" Training Algorithm
Rohwer, Richard
A simple method for training the dynamical behavior of a neural network is derived. It is applicable to any training problem in discrete-time networks with arbitrary feedback. The algorithm resembles back-propagation in that an error function is minimized using a gradient-based method, but the optimization is carried out in the hidden part of state space, either instead of or in addition to weight space. Computational results are presented for some simple dynamical training problems, one of which requires response to a signal 100 time steps in the past.

1 INTRODUCTION

This paper presents a minimization-based algorithm for training the dynamical behavior of a discrete-time neural network model. The central idea is to treat hidden nodes as target nodes with variable training data.
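The "variable training data" idea can be made concrete with a short sketch. The following is a minimal numpy illustration, not Rohwer's exact formulation: the tanh activation, plain gradient descent, learning rate, and the moving_targets name are all choices made here. Hidden activations at every time step are treated as free target variables and optimised jointly with the weights on a one-step prediction error.

```python
import numpy as np

def moving_targets(X_vis, hidden, W, steps=500, lr=0.05):
    """Jointly optimise weights and hidden-state targets by gradient
    descent on the squared one-step prediction error.

    X_vis  : (T, n_vis)  clamped visible activations (training data)
    hidden : (T, n_hid)  initial hidden targets, optimised in place
    W      : (n, n)      feedback weights, n = n_vis + n_hid
    """
    nv = X_vis.shape[1]
    for _ in range(steps):
        x = np.hstack([X_vis, hidden])   # full state-space targets
        a = np.tanh(x[:-1] @ W.T)        # one-step predictions
        err = x[1:] - a                  # prediction residuals
        delta = err * (1.0 - a ** 2)     # residuals back through tanh
        dW = delta.T @ x[:-1] / len(x)   # descent direction for W
        # Hidden-target gradient: each hidden value acts both as a
        # target at step t and as an input to the prediction at t+1.
        g = np.zeros_like(hidden)
        g[1:] += err[:, nv:]             # target role: move toward a
        g[:-1] -= (delta @ W)[:, nv:]    # input role: help step t+1
        W += lr * dW
        hidden -= lr * g
    return W, hidden

# Hypothetical usage on a 101-step sequence of +/-1 visible patterns.
rng = np.random.default_rng(1)
X_vis = rng.integers(0, 2, (101, 2)).astype(float) * 2 - 1
W, H = moving_targets(X_vis, 0.1 * rng.standard_normal((101, 6)),
                      0.1 * rng.standard_normal((8, 8)))
```

Because each hidden target is a free variable, credit assignment over a long delay decomposes into a chain of local one-step constraints rather than one long back-propagated gradient, which is what makes tasks like the 100-step problem mentioned above approachable.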
The "Moving Targets" Training Algorithm
Rohwer, Richard
A simple method for training the dynamical behavior of a neural network is derived. It is applicable to any training problem in discrete-time networks with arbitrary feedback. The algorithm resembles back-propagation in that an error function is minimized using a gradient-based method, but the optimization is carried out in the hidden part of state space either instead of, or in addition to weight space. Computational results are presented for some simple dynamical training problems, one of which requires response to a signal 100 time steps in the past. 1 INTRODUCTION This paper presents a minimization-based algorithm for training the dynamical behavior of a discrete-time neural network model. The central idea is to treat hidden nodes as target nodes with variable training data.