AITopics

By an extension of statistical mechanics methods, we obtain exact results for the time-dependent generalization error of a linear network with a large number of weights N. We find, for example, that for small training sets of size p N, larger learning rates can be used without compromising asymptotic generalization performance or convergence speed. Encouragingly, for optimal settings of TJ (and, less importantly, weight decay,\) at given final learning time, the generalization performance of online learning is essentially as good as that of offline learning.

generalization error, learning, weight decay, (14 more...)

Country: Europe > United Kingdom (0.05)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.66)

Saul, Lawrence K., Jordan, Michael I.

A Variational Principle for Model-based Morphing

Given a multidimensional data set and a model of its density, we consider how to define the optimal interpolation between two points. This is done by assigning a cost to each path through space, based on two competing goals-one to interpolate through regions of high density, the other to minimize arc length. From this path functional, we derive the Euler-Lagrange equations for extremal motionj given two points, the desired interpolation is found by solving a boundary value problem. We show that this interpolation can be done efficiently, in high dimensions, for Gaussian, Dirichlet, and mixture models. 1 Introduction The problem of nonlinear interpolation arises frequently in image, speech, and signal processing. Consider the following two examples: (i) given two profiles of the same face, connect them by a smooth animation of intermediate poses[l]j (ii) given a telephone signal masked by intermittent noise, fill in the missing speech.

boundary value problem, interpolant, interpolation, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Rohwer, Richard, Morciniec, Michal

The Generalisation Cost of RAMnets

We follow a similar approach to (Zhu & Rohwer, to appear 1996) in using a Gaussian process to define a prior over the space of functions, so that the expected generalisation cost under the posterior can be determined. The optimal model is defined in terms of the restriction of this posterior to the subspace defined by the model. The optimum is easily determined for linear models over a set of basis functions. We go on to compute the generalisation cost (with an error bar) for all models of this class, which we demonstrate to include the RAMnets.

formalism, generalisation cost, ramnet, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Peper, Ferdinand, Noda, Hideki

Hebb Learning of Features based on their Information Content

This paper investigates the stationary points of a Hebb learning rule with a sigmoid nonlinearity in it. We show mathematically that when the input has a low information content, as measured by the input's variance, this learning rule suppresses learning, that is, forces the weight vector to converge to the zero vector. When the information content exceeds a certain value, the rule will automatically begin to learn a feature in the input. Our analysis suggests that under certain conditions it is the first principal component that is learned. The weight vector length remains bounded, provided the variance of the input is finite. Simulations confirm the theoretical results derived.

information content, neuron, stationary point, (14 more...)

Country:

Asia > Japan > Shikoku > Kōchi Prefecture > Kochi (0.05)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.05)
Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Parberry, Ian, Tseng, Hung-Li

Are Hopfield Networks Faster than Conventional Computers?

It is shown that conventional computers can be exponentiallx faster than planar Hopfield networks: although there are planar Hopfield networks that take exponential time to converge, a stable state of an arbitrary planar Hopfield network can be found by a conventional computer in polynomial time.

graph, hopfield network, vertex, (13 more...)

Country: North America > United States > Texas > Denton County > Denton (0.04)

Genre: Research Report (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

Removing Noise in On-Line Search using Adaptive Batch Sizes

Orr, Genevieve B.

Stochastic (online) learning can be faster than batch learning. However, at late times, the learning rate must be annealed to remove the noise present in the stochastic weight updates. In this annealing phase, the convergence rate (in mean square) is at best proportional to l/T where T is the number of input presentations. An alternative is to increase the batch size to remove the noise. In this paper we explore convergence for LMS using 1) small but fixed batch sizes and 2) an adaptive batch size. We show that the best adaptive batch schedule is exponential and has a rate of convergence which is the same as for annealing, Le., at best proportional to l/T.

batch size, equation, simulation, (10 more...)

Country:

North America > United States > California (0.14)
North America > United States > Oregon > Marion County > Salem (0.04)
North America > United States > New Jersey (0.04)

Industry: Education > Educational Setting > Online (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Opper, Manfred, Winther, Ole

A Mean Field Algorithm for Bayes Learning in Large Feed-forward Neural Networks

In the Bayes approach to statistical inference [Berger, 1985] one assumes that the prior uncertainty about parameters of an unknown data generating mechanism can be encoded in a probability distribution, the so called prior. Using the prior and the likelihood of the data given the parameters, the posterior distribution of the parameters can be derived from Bayes rule. From this posterior, various estimates for functions ofthe parameter, like predictions about unseen data, can be calculated. However, in general, those predictions cannot be realised by specific parameter values, but only by an ensemble average over parameters according to the posterior probability. Hence, exact implementations of Bayes method for neural networks require averages over network parameters which in general can be performed by time consuming 226 M. Opper and O. Winther Monte Carlo procedures.

approximation, equation, neural network, (14 more...)

Country:

North America > United States > New York (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Krzyzak, Adam, Linder, Tamás

Radial Basis Function Networks and Complexity Regularization in Function Learning

In this paper we apply the method of complexity regularization to derive estimation bounds for nonlinear function estimation using a single hidden layer radial basis function network.

function network, neural network, radial basis function network, (10 more...)

Country:

Europe > Hungary > Budapest > Budapest (0.05)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Littlestone, Nick, Mesterharm, Chris

An Apobayesian Relative of Winnow

We study a mistake-driven variant of an online Bayesian learning algorithm (similar to one studied by Cesa-Bianchi, Helmbold, and Panizza [CHP96]). This variant only updates its state (learns) on trials in which it makes a mistake. The algorithm makes binary classifications using a linear-threshold classifier and runs in time linear in the number of attributes seen by the learner. We have been able to show, theoretically and in simulations, that this algorithm performs well under assumptions quite different from those embodied in the prior of the original Bayesian algorithm. It can handle situations that we do not know how to handle in linear time with Bayesian algorithms. We expect our techniques to be useful in deriving and analyzing other apobayesian algorithms. 1 Introduction We consider two styles of online learning.

algorithm, apobayesian algorithm, sasb, (14 more...)