AITopics

It is shown that conventional computers can be exponentially faster than planar Hopfield networks: although there are planar Hopfield networks that take exponential time to converge, a stable state of an arbitrary planar Hopfield network can be found by a conventional computer in polynomial time.
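To make the notion of a "stable state" concrete, here is a minimal sketch of a small Hopfield network run under its own sequential update rule until no unit wants to flip. This is not the polynomial-time algorithm from the paper; it only illustrates the convergence process the abstract compares against. The 4-cycle graph, the weights, the zero thresholds, the tie-breaking rule and all function names are illustrative assumptions.

```python
def local_field(weights, thresholds, state, i):
    """Weighted input to unit i minus its threshold."""
    return sum(weights[i][j] * state[j] for j in range(len(state))) - thresholds[i]


def is_stable(weights, thresholds, state):
    """A state is stable when every unit already agrees with the sign of its field."""
    for i in range(len(state)):
        desired = 1 if local_field(weights, thresholds, state, i) >= 0 else -1
        if desired != state[i]:
            return False
    return True


def run_to_stable(weights, thresholds, state, max_sweeps=1000):
    """Sequentially update units until a full sweep changes nothing.

    Assumes symmetric weights with a zero diagonal, the standard setting
    in which sequential updates are guaranteed to converge.
    Returns the stable state and the number of flips performed.
    """
    state = list(state)
    flips = 0
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            desired = 1 if local_field(weights, thresholds, state, i) >= 0 else -1
            if desired != state[i]:
                state[i] = desired
                flips += 1
                changed = True
        if not changed:
            return state, flips
    raise RuntimeError("no stable state reached within max_sweeps")


def symmetric_weights(n, edges):
    """Build a symmetric weight matrix from (i, j, weight) triples."""
    w = [[0] * n for _ in range(n)]
    for i, j, value in edges:
        w[i][j] = value
        w[j][i] = value
    return w


# Hypothetical example: a 4-cycle, which is a planar graph.
W = symmetric_weights(4, [(0, 1, 1), (1, 2, -2), (2, 3, 1), (3, 0, -1)])
THETA = [0, 0, 0, 0]
FINAL_STATE, FLIPS = run_to_stable(W, THETA, [1, 1, 1, 1])
print(FINAL_STATE, FLIPS)
```

The number of flips is the quantity that can grow exponentially for some planar networks; the sketch only counts it for one tiny instance.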

graph, hopfield network, vertex, (13 more...)

Country: North America > United States > Texas > Denton County > Denton (0.04)

Genre: Research Report (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

Removing Noise in On-Line Search using Adaptive Batch Sizes

Orr, Genevieve B.

Stochastic (online) learning can be faster than batch learning. However, at late times, the learning rate must be annealed to remove the noise present in the stochastic weight updates. In this annealing phase, the convergence rate (in mean square) is at best proportional to 1/T where T is the number of input presentations. An alternative is to increase the batch size to remove the noise. In this paper we explore convergence for LMS using 1) small but fixed batch sizes and 2) an adaptive batch size. We show that the best adaptive batch schedule is exponential and has a rate of convergence which is the same as for annealing, i.e., at best proportional to 1/T.
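The idea of replacing learning-rate annealing with a growing batch can be sketched as follows. This is a minimal illustration, assuming a linear target with Gaussian inputs and noise, a fixed learning rate, and a batch size that grows geometrically by a hypothetical factor `growth`; none of these settings, nor the function name, come from the paper.

```python
import random


def lms_adaptive_batch(true_w, lr=0.2, b0=1.0, growth=1.5, steps=25,
                       noise=0.5, seed=0):
    """LMS with a fixed learning rate and an exponentially growing batch size.

    Returns the learned weights and a history of
    (total input presentations T, squared distance to true_w).
    """
    rng = random.Random(seed)
    dim = len(true_w)
    w = [0.0] * dim
    presentations = 0
    batch = float(b0)
    history = []
    for _ in range(steps):
        n = max(1, round(batch))
        grad = [0.0] * dim
        for _ in range(n):
            x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            target = sum(a * b for a, b in zip(true_w, x)) + rng.gauss(0.0, noise)
            err = target - sum(a * b for a, b in zip(w, x))
            for k in range(dim):
                grad[k] += err * x[k]
        # Averaging the update over the batch shrinks its noise roughly as 1/n.
        for k in range(dim):
            w[k] += lr * grad[k] / n
        presentations += n
        sq_err = sum((a - b) ** 2 for a, b in zip(w, true_w))
        history.append((presentations, sq_err))
        batch *= growth  # exponential (geometric) batch schedule
    return w, history


WEIGHTS, HISTORY = lms_adaptive_batch([2.0, -1.0])
for total, sq_err in HISTORY[::6]:
    print(total, sq_err)
```

Plotting the squared error against the number of presentations T on log-log axes is one way to compare such a schedule with a 1/T reference line.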

batch size, equation, simulation, (10 more...)

Country:

North America > United States > California (0.14)
North America > United States > Oregon > Marion County > Salem (0.04)
North America > United States > New Jersey (0.04)

Industry: Education > Educational Setting > Online (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Opper, Manfred, Winther, Ole

A Mean Field Algorithm for Bayes Learning in Large Feed-forward Neural Networks

In the Bayes approach to statistical inference [Berger, 1985] one assumes that the prior uncertainty about parameters of an unknown data generating mechanism can be encoded in a probability distribution, the so-called prior. Using the prior and the likelihood of the data given the parameters, the posterior distribution of the parameters can be derived from Bayes' rule. From this posterior, various estimates for functions of the parameter, like predictions about unseen data, can be calculated. However, in general, those predictions cannot be realised by specific parameter values, but only by an ensemble average over parameters according to the posterior probability. Hence, exact implementations of Bayes' method for neural networks require averages over network parameters which in general can be performed by time-consuming Monte Carlo procedures.
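As a toy illustration of a prediction formed as an ensemble average over the posterior, the sketch below estimates a posterior predictive probability by Monte Carlo. It is a deliberately simple stand-in, not the mean field algorithm and not a neural network: the model is a single coin bias with a uniform prior, and the function name, the sample count and the sampling scheme (draws from the prior weighted by the likelihood) are assumptions made for illustration.

```python
import random


def posterior_predictive_mc(heads, tosses, n_samples=20000, seed=0):
    """Estimate P(next toss is heads | data) by averaging over parameters.

    Parameters are drawn from a uniform prior and weighted by the
    likelihood of the observed data, so the weighted average approximates
    an average under the posterior.
    """
    rng = random.Random(seed)
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n_samples):
        theta = rng.random()  # sample from the uniform prior
        likelihood = theta ** heads * (1.0 - theta) ** (tosses - heads)
        weighted_sum += likelihood * theta  # theta is P(heads | theta)
        weight_total += likelihood
    return weighted_sum / weight_total


ESTIMATE = posterior_predictive_mc(heads=7, tosses=10)
# For a uniform prior the exact answer is (heads + 1) / (tosses + 2).
EXACT = (7 + 1) / (10 + 2)
print(ESTIMATE, EXACT)
```

For a network with many weights the same average runs over a high-dimensional parameter space, which is what makes sampling expensive.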

approximation, equation, neural network, (14 more...)

Country:

North America > United States > New York (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Maass, Wolfgang, Orponen, Pekka

On the Effect of Analog Noise in Discrete-Time Analog Computations

We introduce a model for noise-robust analog computations with discrete time that is flexible enough to cover the most important concrete cases, such as computations in noisy analog neural nets and networks of noisy spiking neurons. We show that the presence of arbitrarily small amounts of analog noise reduces the power of analog computational models to that of finite automata, and we also prove a new type of upper bound for the VC-dimension of computational models with analog noise. 1 Introduction Analog noise is a serious issue in practical analog computation. However, there exists no formal model for reliable computations by noisy analog systems which allows us to address this issue in an adequate manner. The investigation of noise-tolerant digital computations in the presence of stochastic failures of gates or wires had been initiated by [von Neumann, 1956]. We refer to [Cowan, 1966] and [Pippenger, 1989] for a small sample of the numerous results that have been achieved in this direction. The same framework (with stochastic failures of gates or wires) has been applied to analog neural nets in [Siegelmann, 1994].

analog noise, computation, neural net, (13 more...)

Country:

Europe > Austria > Styria > Graz (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Ohio (0.04)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Noisy Spiking Neurons with Temporal Coding have more Computational Power than Sigmoidal Neurons

Maass, Wolfgang

Furthermore, it is shown that networks of noisy spiking neurons with temporal coding have a strictly larger computational power than sigmoidal neural nets with the same number of units. 1 Introduction and Definitions We consider a formal model SNN for a spiking neuron network that is basically a reformulation of the spike response model (and of the leaky integrate and fire model) without using δ-functions (see [Maass, 1996a] or [Maass, 1996b] for further background).

neuron, sigmoidal neural, sigmoidal neural net, (13 more...)

Country:

Europe > Austria > Styria > Graz (0.05)
North America > United States > Ohio (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Littlestone, Nick, Mesterharm, Chris

An Apobayesian Relative of Winnow

We study a mistake-driven variant of an online Bayesian learning algorithm (similar to one studied by Cesa-Bianchi, Helmbold, and Panizza [CHP96]). This variant only updates its state (learns) on trials in which it makes a mistake. The algorithm makes binary classifications using a linear-threshold classifier and runs in time linear in the number of attributes seen by the learner. We have been able to show, theoretically and in simulations, that this algorithm performs well under assumptions quite different from those embodied in the prior of the original Bayesian algorithm. It can handle situations that we do not know how to handle in linear time with Bayesian algorithms. We expect our techniques to be useful in deriving and analyzing other apobayesian algorithms. 1 Introduction We consider two styles of online learning.
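The mistake-driven style of learning can be sketched with the classic Winnow algorithm, which also uses a linear-threshold classifier and changes its weights only on trials where it errs. This is the standard Winnow update, swapped in for illustration; it is not the apobayesian algorithm studied in the paper. The target concept (a disjunction of two attributes), the number of attributes, the promotion factor and the helper names are assumptions.

```python
import random


def winnow_train(examples, n_attrs, alpha=2.0):
    """Run Winnow online; weights change only on trials with a mistake.

    Each example is (x, label) with x a list of 0/1 attributes and
    label 0 or 1. Returns the final weights and the number of mistakes.
    """
    weights = [1.0] * n_attrs
    threshold = float(n_attrs)
    mistakes = 0
    for x, label in examples:
        total = sum(w for w, xi in zip(weights, x) if xi)
        prediction = 1 if total >= threshold else 0
        if prediction == label:
            continue  # no mistake, so the state is left untouched
        mistakes += 1
        # Promote active attributes on a false negative, demote on a false positive.
        factor = alpha if label == 1 else 1.0 / alpha
        for i, xi in enumerate(x):
            if xi:
                weights[i] *= factor
    return weights, mistakes


def make_examples(n_examples, n_attrs, relevant, seed=0):
    """Random 0/1 examples labelled by a disjunction of the relevant attributes."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n_examples):
        x = [rng.randint(0, 1) for _ in range(n_attrs)]
        label = 1 if any(x[i] for i in relevant) else 0
        examples.append((x, label))
    return examples


EXAMPLES = make_examples(2000, 16, relevant=(0, 3))
WEIGHTS, MISTAKES = winnow_train(EXAMPLES, 16)
print(MISTAKES, WEIGHTS[0], WEIGHTS[3])
```

Each trial touches every attribute at most twice, so the work per trial is linear in the number of attributes, matching the running-time property mentioned in the abstract.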

algorithm, apobayesian algorithm, sasb, (14 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)

Krzyzak, Adam, Linder, Tamás

Radial Basis Function Networks and Complexity Regularization in Function Learning

In this paper we apply the method of complexity regularization to derive estimation bounds for nonlinear function estimation using a single hidden layer radial basis function network.

function network, neural network, radial basis function network, (10 more...)

Country:

Europe > Hungary > Budapest > Budapest (0.05)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Kowalczyk, Adam, Ferrá, Herman L.

MLP Can Provably Generalize Much Better than VC-bounds Indicate

It is also shown that bounds following the true learning curve can be derived from a formalism based on the density of error patterns.

perceptron, sequence, thermodynamic limit, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Oceania > Australia (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.33)

Kang, Kukjin, Oh, Jong-Hoon

Statistical Mechanics of the Mixture of Experts

The mixture of experts [1, 2] is a well-known example which implements the philosophy of divide-and-conquer elegantly. Whereas this model is gaining popularity in various applications, there has been little effort to evaluate the generalization capability of these modular approaches theoretically. Here we present the first analytic study of generalization in the mixture of experts from the statistical physics perspective. Use of the statistical mechanics formulation has been focused on the study of feedforward neural network architectures close to the multilayer perceptron [5, 6], together with the VC theory [8]. We expect that the statistical mechanics approach can also be effectively used to evaluate more advanced architectures including mixture models.

phase transition, statistical mechanics, symmetry, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)

Practical Confidence and Prediction Intervals

Heskes, Tom

We propose a new method to compute prediction intervals. Especially for small data sets, the width of a prediction interval depends not only on the variance of the target distribution, but also on the accuracy of our estimator of the mean of the target, i.e., on the width of the confidence interval. The confidence interval follows from the variation in an ensemble of neural networks, each of them trained and stopped on bootstrap replicates of the original data set. A second improvement is the use of the residuals on validation patterns instead of on training patterns for estimation of the variance of the target distribution. As illustrated on a synthetic example, our method is better than existing methods with regard to extrapolation and interpolation in data regimes with a limited amount of data, and yields prediction intervals whose actual confidence levels are closer to the desired confidence levels. 1 STATISTICAL INTERVALS In this paper we will consider feedforward neural networks for regression tasks: estimating an underlying mathematical function between input and output variables based on a finite number of data points possibly corrupted by noise.
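The relation between the two kinds of interval can be sketched with a bootstrap ensemble. This is a simplified illustration under stated assumptions, not the method of the paper: straight-line fits stand in for neural networks, the noise variance is taken to be constant and estimated from residuals on the points left out of each bootstrap replicate (which slightly overestimates it, since those residuals also contain model error), and a Gaussian multiplier `z` is used. All names and the synthetic data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate replicate: all x values identical
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_intervals(xs, ys, x_new, n_models=200, z=1.96, seed=0):
    """Return (confidence interval, prediction interval) at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    validation_sq = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap replicate
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(slope * x_new + intercept)
        chosen = set(idx)
        # Points not drawn into the replicate act as validation patterns.
        for i in range(n):
            if i not in chosen:
                validation_sq.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    mean = sum(preds) / len(preds)
    conf_var = statistics.variance(preds)  # spread of the ensemble
    noise_var = sum(validation_sq) / len(validation_sq) if validation_sq else 0.0
    conf_half = z * conf_var ** 0.5
    pred_half = z * (conf_var + noise_var) ** 0.5
    return (mean - conf_half, mean + conf_half), (mean - pred_half, mean + pred_half)


_rng = random.Random(1)
XS = [i / 10 for i in range(20)]
YS = [2.0 * x + 1.0 + _rng.gauss(0.0, 0.3) for x in XS]
CONF, PRED = bootstrap_intervals(XS, YS, x_new=1.0)
print(CONF, PRED)
```

The prediction interval is always at least as wide as the confidence interval, because it adds the target noise variance to the variance of the estimated mean.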

confidence interval, prediction interval, variance, (16 more...)

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)