AITopics

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Rohwer, Richard, Morciniec, Michal

The Generalisation Cost of RAMnets

Neural Information Processing Systems, Dec-31-1997

We follow a similar approach to (Zhu & Rohwer, to appear 1996) in using a Gaussian process to define a prior over the space of functions, so that the expected generalisation cost under the posterior can be determined. The optimal model is defined in terms of the restriction of this posterior to the subspace defined by the model. The optimum is easily determined for linear models over a set of basis functions. We go on to compute the generalisation cost (with an error bar) for all models of this class, which we demonstrate to include the RAMnets.

formalism, generalisation cost, ramnet, (13 more...)

Rohwer, Richard, Morciniec, Michal

The Generalisation Cost of RAMnets

Neural Information Processing Systems, Dec-31-1997

Neural Computing Research Group, Aston University, Aston Triangle, Birmingham B4 7ET, UK.

Abstract

Given unlimited computational resources, it is best to use a criterion of minimal expected generalisation error to select a model and determine its parameters. However, it may be worthwhile to sacrifice some generalisation performance for higher learning speed. A method for quantifying sub-optimality is set out here, so that this choice can be made intelligently. Furthermore, the method is applicable to a broad class of models, including the ultra-fast memory-based methods such as RAMnets. This brings the added benefit of providing, for the first time, the means to analyse the generalisation properties of such models in a Bayesian framework.

1 Introduction

In order to quantitatively predict the performance of methods such as the ultra-fast RAMnet, which are not trained by minimising a cost function, we develop a Bayesian formalism for estimating the generalisation cost of a wide class of algorithms.
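To make concrete what "not trained by minimising a cost function" can look like, here is a minimal sketch of an n-tuple (RAMnet-style) classifier as such methods are commonly described: random groups of input bits address lookup tables, training records which addresses occur for each class, and classification counts matches. The class name `NTupleClassifier`, its parameters, and the toy data are illustrative assumptions; none of this is taken from the paper.

```python
import random


class NTupleClassifier:
    """Illustrative n-tuple classifier over fixed-length bit vectors (an assumption, not the paper's code)."""

    def __init__(self, n_bits, n_tuples, tuple_size, classes, seed=0):
        rng = random.Random(seed)
        # Each tuple is a fixed random choice of input bit positions.
        self.tuples = [rng.sample(range(n_bits), tuple_size) for _ in range(n_tuples)]
        # One set of seen addresses per (class, tuple); a set stands in for a 1-bit-wide RAM.
        self.rams = {c: [set() for _ in range(n_tuples)] for c in classes}

    def _addresses(self, bits):
        # The bits selected by each tuple form the address into that tuple's RAM.
        return [tuple(bits[i] for i in positions) for positions in self.tuples]

    def train(self, bits, label):
        # Training is a single pass: mark every addressed location for this class.
        for ram, addr in zip(self.rams[label], self._addresses(bits)):
            ram.add(addr)

    def scores(self, bits):
        # Score per class = number of tuples whose address was seen in training.
        addrs = self._addresses(bits)
        return {c: sum(a in ram for ram, a in zip(rams, addrs))
                for c, rams in self.rams.items()}

    def predict(self, bits):
        s = self.scores(bits)
        return max(s, key=s.get)


clf = NTupleClassifier(n_bits=8, n_tuples=4, tuple_size=3, classes=["a", "b"])
clf.train([1, 1, 1, 1, 0, 0, 0, 0], "a")
clf.train([0, 0, 0, 0, 1, 1, 1, 1], "b")
result = clf.predict([1, 1, 1, 1, 0, 0, 0, 0])
print(result)
```

Training here involves no iterative optimisation at all, which is why such methods are fast and why their generalisation behaviour needs a separate analysis of the kind the abstract describes.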

formalism, generalisation cost, ramnet, (13 more...)

Country:

Europe > United Kingdom (0.24)
North America > Canada > Ontario > Toronto (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Time Trials on Second-Order and Variable-Learning-Rate Algorithms

Neural Information Processing Systems, Dec-31-1991

The performance of seven minimization algorithms is compared on five neural network problems. These include a variable-step-size algorithm, conjugate gradient, and several methods with explicit analytic or numerical approximations to the Hessian.

algorithm, target node, time trial, (13 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > Scotland (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Time Trials on Second-Order and Variable-Learning-Rate Algorithms

Neural Information Processing Systems, Dec-31-1991

In four of these methods, the gradient is divided component-wise by a decaying average of either the second derivatives or their absolute values.
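The update described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `scaled_gradient_step`, the learning rate `lr`, the decay constant `decay`, the small `eps` added to the denominator, and the toy quadratic are all hypothetical choices, not details from the paper.

```python
def scaled_gradient_step(w, grad, second, avg, lr=0.5, decay=0.9, use_abs=True, eps=1e-8):
    """One step: divide each gradient component by a decaying average of second derivatives.

    w, grad, second, avg are equal-length lists (weights, dE/dw, d2E/dw2, running average).
    With use_abs=True the average is taken over absolute values of the second derivatives.
    eps is an assumed safeguard against a zero denominator; it does not handle a
    negative average, which can occur when use_abs=False.
    """
    new_w, new_avg = [], []
    for wi, gi, hi, ai in zip(w, grad, second, avg):
        h = abs(hi) if use_abs else hi
        ai = decay * ai + (1.0 - decay) * h      # decaying (exponential) average
        new_avg.append(ai)
        new_w.append(wi - lr * gi / (ai + eps))  # component-wise scaled step
    return new_w, new_avg


# Toy problem (assumed): E(w) = 0.5 * (100 * w0**2 + w1**2), a badly conditioned quadratic.
curvature = [100.0, 1.0]
w = [1.0, 1.0]
avg = list(curvature)  # start the running average at the first observed curvatures
for _ in range(10):
    grad = [c * wi for c, wi in zip(curvature, w)]
    w, avg = scaled_gradient_step(w, grad, curvature, avg)
print(w)
```

On this toy quadratic both weights shrink at the same rate despite the hundredfold difference in curvature, which is the point of scaling each component separately.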

algorithm, artificial intelligence, machine learning, (16 more...)

Country:

North America > United States > California (0.14)
Europe (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

Krogh, Anders, Thorbergsson, C. I., Hertz, John A.

A Cost Function for Internal Representations

We introduce a cost function for learning in feed-forward neural networks which is an explicit function of the internal representation in addition to the weights. The learning problem can then be formulated as two simple perceptrons and a search for internal representations. Back-propagation is recovered as a limit. The frequency of successful solutions is better for this algorithm than for back-propagation when weights and hidden units are updated on the same timescale, i.e. once every learning step.

1 INTRODUCTION

In their review of back-propagation in layered networks, Rumelhart et al. (1986) describe the learning process in terms of finding good "internal representations" of the input patterns on the hidden units. However, the search for these representations is an indirect one, since the variables which are adjusted in its course are the connection weights, not the activations of the hidden units themselves when specific input patterns are fed into the input layer. Rather, the internal representations are represented implicitly in the connection weight values. More recently, Grossman et al. (1988 and 1989) suggested a way in which the search for internal representations could be made much more explicit.

algorithm, cost function, internal representation, (12 more...)

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.36)

The "Moving Targets" Training Algorithm

A simple method for training the dynamical behavior of a neural network is derived. It is applicable to any training problem in discrete-time networks with arbitrary feedback. The algorithm resembles back-propagation in that an error function is minimized using a gradient-based method, but the optimization is carried out in the hidden part of state space either instead of, or in addition to, weight space. Computational results are presented for some simple dynamical training problems, one of which requires response to a signal 100 time steps in the past.

1 INTRODUCTION

This paper presents a minimization-based algorithm for training the dynamical behavior of a discrete-time neural network model. The central idea is to treat hidden nodes as target nodes with variable training data.
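The central idea can be illustrated with a small sketch in which the hidden-node values along the trajectory are free variables, optimised by gradient descent together with the weights. The tanh activation, the squared one-step error, plain gradient descent, and all names (`error_and_grads`, `train`, `lr`, `steps`) are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def error_and_grads(W, X):
    """Squared one-step prediction error over a trajectory X[0..T], with its gradients.

    X[t][i] is the state of node i at time t; W[i][j] is the weight from node j to node i.
    """
    T, N = len(X) - 1, len(W)
    E = 0.0
    gW = [[0.0] * N for _ in range(N)]
    gX = [[0.0] * N for _ in range(T + 1)]
    for t in range(T):
        for i in range(N):
            out = math.tanh(sum(W[i][j] * X[t][j] for j in range(N)))
            resid = X[t + 1][i] - out
            E += 0.5 * resid ** 2
            gX[t + 1][i] += resid               # X[t+1][i] enters as a target
            delta = -resid * (1.0 - out ** 2)   # dE / d(pre-activation)
            for j in range(N):
                gW[i][j] += delta * X[t][j]
                gX[t][j] += delta * W[i][j]     # X[t][j] enters as an input
    return E, gW, gX


def train(X, hidden, steps=300, lr=0.05, seed=0):
    """Gradient descent on the weights and on the hidden ("moving") targets only."""
    rng = random.Random(seed)
    N = len(X[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(N)]
    X = [row[:] for row in X]
    history = []
    for _ in range(steps):
        E, gW, gX = error_and_grads(W, X)
        history.append(E)
        for i in range(N):
            for j in range(N):
                W[i][j] -= lr * gW[i][j]
        for t in range(len(X)):
            for h in hidden:
                X[t][h] -= lr * gX[t][h]        # visible targets stay fixed
    return W, X, history


# Toy task (assumed): visible node 0 should alternate +0.5 / -0.5; node 1 is hidden.
X0 = [[0.5 * (-1) ** t, 0.0] for t in range(7)]
W, X_fit, history = train(X0, hidden=[1])
print(history[0], history[-1])
```

Because every node has a value at every time step, the error decomposes into one-step prediction terms, so no back-propagation through time is needed in this sketch.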

algorithm, node, rohwer, (16 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Illinois (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
