AITopics

For general networks with loops, the situation is much less clear. On the one hand, a number of researchers have empirically demonstrated good performance for BP algorithms applied to networks with loops. One dramatic case is the near Shannon-limit performance of "Turbo codes", whose decoding algorithm is equivalent to BP on a loopy network [2, 6]. For some problems in computer vision involving networks with loops, BP has also been shown to be accurate and to converge very quickly [2, 1, 7]. On the other hand, for other networks with loops, BP may give poor results or fail to converge [7]. For a general graph, little has been understood about what approximation BP represents, and how it might be improved. This paper's goal is to provide that understanding and introduce a set of new algorithms resulting from that understanding. We show that BP is the first in a progression of local message-passing algorithms, each giving equivalent results to a corresponding approximation from statistical physics known as the "Kikuchi" approximation to the Gibbs free energy. These algorithms have the attractive property of being user-adjustable: by paying some additional computational cost, one can obtain considerable improvement in the accuracy of one's approximation, and can sometimes obtain a convergent message-passing algorithm when ordinary BP does not converge.

approximation, artificial intelligence, belief revision, (19 more...)

Country: North America > United States (0.14)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)

Caruana, Rich, Lawrence, Steve, Giles, C. Lee

Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping

The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity generalize well when trained with backprop and early stopping. Experiments suggest two reasons for this: 1) Overfitting can vary significantly in different regions of the model. Excess capacity allows better fit to regions of high non-linearity, and backprop often avoids overfitting the regions of low non-linearity.
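As a rough illustration of early stopping, the sketch below picks the epoch with the lowest validation loss and halts once no improvement has been seen for a fixed number of epochs. The abstract does not say which stopping rule the authors used, so the patience-based rule, the function name `early_stopping_epoch`, and the validation losses are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) under a patience-based stopping rule.

    Assumption: training halts once `patience` epochs have passed since the
    best validation loss. This rule is illustrative, not the paper's.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember where it occurred.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return best_epoch, epoch
    # Ran out of epochs without triggering the stopping rule.
    return best_epoch, len(val_losses) - 1


# Hypothetical validation curve: improves, then starts to overfit.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
best_epoch, stop_epoch = early_stopping_epoch(val_losses, patience=3)
print(best_epoch, stop_epoch)
```

In practice the weights saved at `best_epoch` would be the ones kept, which is what allows a net with excess capacity to be used before it overfits.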

artificial intelligence, generalization, neural network, (14 more...)

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Energy > Oil & Gas (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.41)

Mizutani, Eiji, Demmel, James

On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems

Our algorithm exploits the special structure of the sum of squared error measure in Equation (1); hence, other objective functions are outside the scope of this paper. The gradient vector and Hessian matrix are given by $g = g(\theta) = J^T r$ and $H = H(\theta) = J^T J + S$, where $J$ is the $m \times n$ Jacobian matrix of $r$, and $S$ denotes the matrix of second-derivative terms. If $S$ is simply omitted based on the "small residual" assumption, then the Hessian matrix reduces to the Gauss-Newton model Hessian, i.e., $J^T J$. Furthermore, a family of quasi-Newton methods can be applied to approximate the term $S$ alone, leading to the augmented Gauss-Newton model Hessian (see, for example, Mizutani [2] and references therein).
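The gradient and Gauss-Newton model Hessian described above can be sketched on a toy problem. This is a minimal illustration, not the paper's Krylov-dogleg method: the linear residual $r_i = \theta_0 x_i + \theta_1 - y_i$, the data, and all function names are hypothetical, and the second-derivative term $S$ is simply dropped.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def matvec(a, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def residuals(theta, xs, ys):
    # Hypothetical residual vector r for a straight-line fit.
    return [theta[0] * x + theta[1] - y for x, y in zip(xs, ys)]


def jacobian(theta, xs):
    # m x n Jacobian of r: row i is [dr_i/dtheta_0, dr_i/dtheta_1].
    return [[x, 1.0] for x in xs]


def gauss_newton_model(theta, xs, ys):
    r = residuals(theta, xs, ys)
    J = jacobian(theta, xs)
    Jt = transpose(J)
    g = matvec(Jt, r)     # gradient g = J^T r
    H_gn = matmul(Jt, J)  # Gauss-Newton model Hessian J^T J (S omitted)
    return g, H_gn


xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
g, H_gn = gauss_newton_model([0.0, 0.0], xs, ys)
print(g)     # [-13.0, -9.0]
print(H_gn)  # [[5.0, 3.0], [3.0, 3.0]]
```

Because this toy residual is linear in the parameters, its second derivatives vanish and $J^T J$ is the exact Hessian; for a neural network the residual is nonlinear and dropping $S$ is an approximation.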

algorithm, neural network, upstream oil & gas, (16 more...)

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Industry: Energy > Oil & Gas > Upstream (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)

Højen-Sørensen, Pedro A. d. F. R., Winther, Ole, Hansen, Lars Kai

Ensemble Learning and Linear Response Theory for ICA

We propose a general Bayesian framework for performing independent component analysis (ICA) which relies on ensemble learning and linear response theory known from statistical physics. We apply it to both discrete and continuous sources. For the continuous source the underdetermined (overcomplete) case is studied. The naive mean-field approach fails in this case, whereas linear response theory, which gives an improved estimate of covariances, is very efficient. The examples given are for sources without temporal correlations. However, this derivation can easily be extended to treat temporal correlations. Finally, the framework offers a simple way of generating new ICA algorithms without needing to define the prior distribution of the sources explicitly.

artificial intelligence, bayesian inference, machine learning, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Europe > Sweden > Skåne County > Lund (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

Scarpetta, Silvia, Li, Zhaoping, Hertz, John A.

Spike-Timing-Dependent Learning for Oscillatory Networks

The model structure is an abstraction of the hippocampus or the olfactory cortex. We propose a simple generalized Hebbian rule, using temporal-activity-dependent LTP and LTD, to encode both magnitudes and phases of oscillatory patterns into the synapses in the network. After learning, the model responds resonantly to inputs which have been learned (or, for networks which operate essentially linearly, to linear combinations of learned inputs), but negligibly to other input patterns. Encoding both amplitude and phase enhances computational capacity, for which the price is having to learn both the excitatory-to-excitatory and the excitatory-to-inhibitory connections. Our model puts constraints on the form of the learning kernel A(r) that should be experimentally observed: for example, for small oscillation frequencies it requires that the overall LTP dominates the overall LTD, but this requirement should be modified if the stored oscillations are of high frequencies.

frequency, machine learning, pattern recognition, (19 more...)

Country:

Europe > Italy (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Industry: Health & Medicine (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Boyan, Justin A., Littman, Michael L.

Exact Solutions to Time-Dependent MDPs

Running times are the median of five runs on an UltraSparc II (296MHz CPU, 256Mb RAM).

artificial intelligence, planning & scheduling, time-value function, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Industry:

Government > Space Agency (0.47)
Government > Regional Government > North America Government > United States Government (0.47)
Transportation > Ground (0.30)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.47)

Hayton, Paul M., Schölkopf, Bernhard, Tarassenko, Lionel, Anuzis, Paul

Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra

A system has been developed to extract diagnostic information from jet engine carcass vibration data. Support Vector Machines applied to novelty detection provide a measure of how unusual the shape of a vibration signature is, by learning a representation of normality. We describe a novel method for including information from a second class in Support Vector Machine novelty detection and give results from its application to jet engine vibration analysis.

data mining, engine, machine learning, (13 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
North America > United States > Washington > King County > Redmond (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report (0.49)

Industry: Aerospace & Defense (0.83)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Naphade, Milind R., Kozintsev, Igor, Huang, Thomas S.

Probabilistic Semantic Video Indexing

We propose a novel probabilistic framework for semantic video indexing. We define probabilistic multimedia objects (multijects) to map low-level media features to high-level semantic labels.

artificial intelligence, machine learning, multiject, (16 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Malzahn, Dörthe, Opper, Manfred

Learning Curves for Gaussian Processes Regression: A Framework for Good Approximations

Based on a statistical mechanics approach, we develop a method for approximately computing average case learning curves for Gaussian process regression models. The approximation works well in the large sample size limit and for arbitrary dimensionality of the input space. We explain how the approximation can be systematically improved and argue that similar techniques can be applied to general likelihood models.

1 Introduction

Gaussian process (GP) models have gained considerable interest in the Neural Computation Community (see e.g. [1, 2, 3, 4]) in recent years. Being nonparametric models by construction, their theoretical understanding seems to be less well developed compared to simpler parametric models like neural networks. We are especially interested in developing theoretical approaches which will at least give good approximations to generalization errors when the number of training data is sufficiently large. In this paper we present a step in this direction which is based on a statistical mechanics approach.

approximation, artificial intelligence, machine learning, (15 more...)

Country:

Europe > United Kingdom (0.14)
Asia > Middle East > Jordan (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

Arleo, Angelo, Smeraldi, Fabrizio, Hug, Stéphane, Gerstner, Wulfram

Place Cells and Spatial Navigation Based on 2D Visual Feature Extraction, Path Integration, and Reinforcement Learning

Visual input, provided by a video camera on a miniature robot, is preprocessed by a set of Gabor filters on 31 nodes of a log-polar retinotopic graph. Unsupervised Hebbian learning is employed to incrementally build a population of localized overlapping place fields. Place cells serve as basis functions for reinforcement learning. Experimental results for goal-oriented navigation of a mobile robot are presented.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Switzerland > Vaud > Lausanne (0.05)
(3 more...)

Industry: Media > Television (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)