Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks

Neural Information Processing Systems

The curse of dimensionality is severe when modeling high-dimensional discrete data: the number of possible combinations of the variables explodes exponentially. In this paper we propose a new architecture for modeling high-dimensional data that requires resources (parameters and computations) that grow at most as the square of the number of variables, using a multi-layer neural network to represent the joint distribution of the variables as the product of conditional distributions. The neural network can be interpreted as a graphical model without hidden random variables, but in which the conditional distributions are tied through the hidden units. The connectivity of the neural network can be pruned by using dependency tests between the variables. Experiments on modeling the distribution of several discrete data sets show statistically significant improvements over other methods such as naive Bayes and comparable Bayesian networks, and show that significant improvements can be obtained by pruning the network.

1 Introduction

The curse of dimensionality hits particularly hard on models of high-dimensional discrete data because there are many more possible combinations of the values of the variables than can possibly be observed in any data set, even the large data sets now common in data-mining applications.
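
To make the factorization concrete, here is a minimal sketch, not the paper's exact architecture, of a network that represents a joint distribution over binary variables as a product of conditionals P(x_i | x_1..x_{i-1}) tied through one shared hidden layer; the layer sizes and weight names (W_in, W_out) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n_vars, n_hidden = 8, 16
W_in = rng.normal(0, 0.1, (n_hidden, n_vars))   # inputs -> shared hidden layer
b_h = np.zeros(n_hidden)
W_out = rng.normal(0, 0.1, (n_vars, n_hidden))  # hidden -> per-variable logit
b_o = np.zeros(n_vars)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_prob(x):
    """Log P(x) under the autoregressive factorization."""
    lp = 0.0
    for i in range(n_vars):
        # Only variables preceding x_i may influence its conditional:
        # zero out x_i..x_{n-1} before the forward pass.
        masked = np.where(np.arange(n_vars) < i, x, 0.0)
        h = np.tanh(W_in @ masked + b_h)
        p_i = sigmoid(W_out[i] @ h + b_o[i])     # P(x_i = 1 | x_<i)
        lp += x[i] * np.log(p_i) + (1 - x[i]) * np.log(1 - p_i)
    return lp

x = rng.integers(0, 2, n_vars).astype(float)
print(log_prob(x))

Since the shared hidden layer contributes on the order of n_vars * n_hidden weights, resources stay at most quadratic in the number of variables when the hidden layer scales linearly with it, in line with the abstract's claim.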


Robust Learning of Chaotic Attractors

Neural Information Processing Systems

A fundamental problem with the modeling of chaotic time series data is that minimizing short-term prediction errors does not guarantee a match between the reconstructed attractors of model and experiments. We introduce a modeling paradigm that simultaneously learns to make short-term predictions and to locate the outlines of the attractor by a new form of nonlinear principal component analysis. Closed-loop predictions are constrained to stay within these outlines, to prevent divergence from the attractor. Learning is exceptionally fast: parameter estimation for the 1000-sample laser data from the 1991 Santa Fe time series competition took less than a minute on a 166 MHz Pentium PC.
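
A simplified sketch of the constraint described above, assuming linear PCA as a stand-in for the paper's nonlinear principal component analysis: each closed-loop prediction is projected back onto a learned low-dimensional description of the attractor, so iterated predictions cannot drift away. The limit-cycle demo and the drifting predictor are toy assumptions.

import numpy as np

def fit_attractor_subspace(states, n_components=2):
    """Linear PCA of observed states: mean + top principal directions.
    (The paper uses a new nonlinear PCA; linear PCA is a stand-in here.)"""
    mu = states.mean(axis=0)
    _, _, vt = np.linalg.svd(states - mu, full_matrices=False)
    return mu, vt[:n_components]

def project_to_attractor(x, mu, basis):
    """Snap a predicted state back onto the learned 'outline' subspace."""
    return mu + (x - mu) @ basis.T @ basis

def closed_loop(x0, predict_one_step, mu, basis, n_steps):
    """Iterated prediction with the divergence-preventing constraint."""
    xs, x = [x0], x0
    for _ in range(n_steps):
        x = predict_one_step(x)                  # short-term prediction
        x = project_to_attractor(x, mu, basis)   # stay near the attractor
        xs.append(x)
    return np.array(xs)

# Toy demo: a planar limit cycle embedded in 3-D, and a predictor with a
# small systematic drift out of the plane that the projection suppresses.
theta = np.linspace(0, 2 * np.pi, 200)
states = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
mu, basis = fit_attractor_subspace(states)
drifty = lambda x: x + np.array([-x[1], x[0], 0.3]) * 0.05
traj = closed_loop(states[0], drifty, mu, basis, n_steps=500)
print(np.abs(traj[:, 2]).max())  # stays ~0 thanks to the constraint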


A Generative Model for Attractor Dynamics

Neural Information Processing Systems

Attractor networks, which map an input space to a discrete output space, are useful for pattern completion. However, designing a net to have a given set of attractors is notoriously tricky; training procedures are CPU intensive and often produce spurious attractors and ill-conditioned attractor basins. These difficulties occur because each connection in the network participates in the encoding of multiple attractors. We describe an alternative formulation of attractor networks in which the encoding of knowledge is local, not distributed. Although localist attractor networks have similar dynamics to their distributed counterparts, they are much easier to work with and interpret.
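
A minimal sketch of localist attractor dynamics as described above: each attractor is stored as its own prototype vector, and the state moves toward a responsibility-weighted average of the prototypes, so no single parameter encodes more than one attractor. The Gaussian responsibilities and fixed width sigma are simplifying assumptions, not the paper's exact update.

import numpy as np

def localist_attractor(y0, prototypes, sigma=0.5, n_iters=50):
    y = y0.copy()
    for _ in range(n_iters):
        # Responsibility of each locally stored attractor for the state.
        d2 = ((prototypes - y) ** 2).sum(axis=1)
        q = np.exp(-d2 / (2 * sigma ** 2))
        q /= q.sum()
        # Move the state toward the responsibility-weighted prototype.
        y = q @ prototypes
    return y

protos = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
print(localist_attractor(np.array([0.6, 0.2]), protos))  # settles near [1, 0]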


Bayesian Averaging is Well-Temperated

Neural Information Processing Systems

Often a learning problem has a natural quantitative measure of generalization. If a loss function is defined, the natural measure is the generalization error, i.e., the expected loss on a random sample independent of the training set. Generalizability is a key topic of learning theory and much progress has been reported. Analytic results for a broad class of machines can be found in the literature [8, 12, 9, 10] describing the asymptotic generalization ability of supervised algorithms that are continuously parameterized. Asymptotic bounds on generalization for general machines have been advocated by Vapnik [11]. Generalization results valid for finite training sets can only be obtained for specific learning machines, see e.g.
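
For concreteness, the generalization error invoked above can be written as the expected loss under the data distribution, contrasted with its empirical counterpart on the training set (the notation below is ours, not the paper's):

\[
\epsilon(\hat f) = \mathbb{E}_{(x,y)\sim P}\left[\ell\bigl(y,\hat f(x)\bigr)\right],
\qquad
\hat\epsilon(\hat f) = \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(y_i,\hat f(x_i)\bigr),
\]

where $(x_1,y_1),\dots,(x_n,y_n)$ is the training set drawn i.i.d. from $P$ and $\ell$ is the loss function.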


Constrained Hidden Markov Models

Neural Information Processing Systems

By thinking of each state in a hidden Markov model as corresponding to some spatial region of a fictitious topology space, it is possible to define neighbouring states naturally as those which are connected in that space. The transition matrix can then be constrained to allow transitions only between neighbours; this means that all valid state sequences correspond to connected paths in the topology space. I show how such constrained HMMs can learn to discover underlying structure in complex sequences of high-dimensional data, and apply them to the problem of recovering mouth movements from acoustics in continuous speech.
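
A minimal sketch of the constraint: place the states on a fictitious topology (here a 1-D ring, an illustrative choice) and zero every transition except those between a state and its immediate neighbours before renormalizing the rows.

import numpy as np

def neighbour_constrained_transitions(n_states):
    """Transition matrix allowing only self-loops and ring-neighbour moves."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        for j in (i - 1, i, i + 1):          # self + ring neighbours
            A[i, j % n_states] = 1.0
    return A / A.sum(axis=1, keepdims=True)  # rows are distributions

A = neighbour_constrained_transitions(6)
print(A)  # any valid state sequence is a connected path on the ring

Since Baum-Welch re-estimation never turns a zero transition probability into a nonzero one, initializing with this mask is enough to preserve the constraint throughout training.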


Bayesian Map Learning in Dynamic Environments

Neural Information Processing Systems

We consider the problem of learning a grid-based map using a robot with noisy sensors and actuators. We compare two approaches: online EM, where the map is treated as a fixed parameter, and Bayesian inference, where the map is a (matrix-valued) random variable. We show that even on a very simple example, online EM can get stuck in local minima, which causes the robot to get "lost" and the resulting map to be useless. By contrast, the Bayesian approach, by maintaining multiple hypotheses, is much more robust. We then introduce a method for approximating the Bayesian solution, called Rao-Blackwellised particle filtering. We show that this approximation, when coupled with an active learning strategy, is fast but accurate.
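
A toy sketch of the Rao-Blackwellised idea on a 1-D grid world: particles are sampled only over the robot's pose, while each particle carries an analytic posterior over the map (independent Beta counts per cell for "occupied"). The world size, motion noise, and sensor model are illustrative assumptions, not the paper's experimental setup.

import numpy as np

rng = np.random.default_rng(1)
N_CELLS, N_PART = 10, 100

class RBPF:
    """Particles over robot pose; analytic Beta posterior per map cell."""
    def __init__(self):
        self.pose = np.zeros(N_PART, dtype=int)
        # counts[p, c] = [occupied_count, free_count] for particle p, cell c
        self.counts = np.ones((N_PART, N_CELLS, 2))

    def step(self, action, obs, p_correct=0.8):
        # 1. Sample noisy motion for every particle.
        noise = rng.choice([-1, 0, 1], N_PART, p=[0.1, 0.8, 0.1])
        self.pose = np.clip(self.pose + action + noise, 0, N_CELLS - 1)
        # 2. Weight by the marginal likelihood of obs under each particle's
        #    map posterior -- the Rao-Blackwellised (analytic) part.
        c = self.counts[np.arange(N_PART), self.pose]
        p_occ = c[:, 0] / c.sum(axis=1)
        p_obs1 = p_occ * p_correct + (1 - p_occ) * (1 - p_correct)
        w = p_obs1 if obs == 1 else 1 - p_obs1
        w = w / w.sum()
        # 3. Resample, then fold the observation into each map posterior.
        idx = rng.choice(N_PART, N_PART, p=w)
        self.pose, self.counts = self.pose[idx], self.counts[idx].copy()
        self.counts[np.arange(N_PART), self.pose, 0 if obs == 1 else 1] += 1

f = RBPF()
f.step(action=1, obs=1)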


Application of Blind Separation of Sources to Optical Recording of Brain Activity

Neural Information Processing Systems

In the analysis of data recorded by optical imaging from intrinsic signals (measurement of changes of light reflectance from cortical tissue), the removal of noise and artifacts such as blood vessel patterns is a serious problem. Often bandpass filtering is used, but the underlying assumption that a spatial frequency exists which separates the mapping component from other components (especially the global signal) is questionable. Here we propose alternative ways of processing optical imaging data, using blind source separation techniques based on the spatial decorrelation of the data. We first perform benchmarks on artificial data in order to select the way of processing which is most robust with respect to sensor noise. We then apply it to recordings of optical imaging experiments from macaque primary visual cortex. We show that our BSS technique is able to extract ocular dominance and orientation preference maps from single condition stacks for data where standard post-processing procedures fail. Artifacts, especially blood vessel patterns, can often be completely removed from the maps. In summary, our method for blind source separation using extended spatial decorrelation is a superior technique for the analysis of optical recording data.
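
A minimal sketch of blind source separation by second-order spatial decorrelation, using an AMUSE-style two-matrix diagonalization as a stand-in for the paper's extended spatial decorrelation, and synthetic 1-D signals in place of image stacks: whiten the mixtures, then diagonalize a spatially shifted covariance matrix.

import numpy as np

def spatial_decorrelation_bss(X, shift=1):
    """X: (n_mixtures, n_pixels). Returns estimated unmixed sources."""
    X = X - X.mean(axis=1, keepdims=True)
    # Whitening transform from the zero-shift covariance.
    C0 = X @ X.T / X.shape[1]
    d, E = np.linalg.eigh(C0)
    W_white = E @ np.diag(d ** -0.5) @ E.T
    Z = W_white @ X
    # Symmetrized covariance between the signal and a shifted copy of itself.
    C1 = Z[:, shift:] @ Z[:, :-shift].T / (Z.shape[1] - shift)
    C1 = (C1 + C1.T) / 2
    _, V = np.linalg.eigh(C1)       # distinct eigenvalues -> separating basis
    return V.T @ Z

# Two spatially smooth sources, linearly mixed.
t = np.linspace(0, 8 * np.pi, 2000)
S = np.vstack([np.sin(t), np.sign(np.cos(0.7 * t))])
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ S
print(spatial_decorrelation_bss(X).shape)  # (2, 2000): recovered sources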


Large Margin DAGs for Multiclass Classification

Neural Information Processing Systems

We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many two-class classifiers into a multiclass classifier. For an N-class problem, the DDAG contains N(N-1)/2 classifiers, one for each pair of classes. We present a VC analysis of the case when the node classifiers are hyperplanes; the resulting bound on the test error depends on N and on the margin achieved at the nodes, but not on the dimension of the space. This motivates an algorithm, DAGSVM, which operates in a kernel-induced feature space and uses two-class maximal margin hyperplanes at each decision node of the DDAG. The DAGSVM is substantially faster to train and evaluate than either the standard algorithm or Max Wins, while maintaining comparable accuracy to both of these algorithms.

1 Introduction

The problem of multiclass classification, especially for systems like SVMs, does not present an easy solution. It is generally simpler to construct classifier theory and algorithms for two mutually-exclusive classes than for N mutually-exclusive classes.
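
A minimal sketch of DDAG evaluation: keep a list of candidate classes and, at each node, let the pairwise classifier for (first, last) eliminate one of the two, so a prediction uses only N-1 of the N(N-1)/2 trained classifiers. The nearest-center pairwise classifiers below are stand-ins for the trained two-class SVMs.

def ddag_predict(x, classes, pairwise):
    """pairwise maps a class pair (i, j), i < j, to a two-class decision
    function returning i or j; the DDAG eliminates one class per node."""
    remaining = list(classes)
    while len(remaining) > 1:
        i, j = remaining[0], remaining[-1]
        winner = pairwise[(i, j)](x)      # two-class decision at this node
        remaining.remove(j if winner == i else i)
    return remaining[0]

# Toy usage: classify a scalar by nearest of three class "centers".
centers = {0: -1.0, 1: 0.0, 2: 1.0}
pairwise = {(i, j): (lambda x, i=i, j=j:
            i if abs(x - centers[i]) < abs(x - centers[j]) else j)
            for i in centers for j in centers if i < j}
print(ddag_predict(0.8, [0, 1, 2], pairwise))  # -> 2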


Variational Inference for Bayesian Mixtures of Factor Analysers

Neural Information Processing Systems

We present an algorithm that infers the model structure of a mixture of factor analysers using an efficient and deterministic variational approximation to full Bayesian integration over model parameters. This procedure can automatically determine the optimal number of components and the local dimensionality of each component (i.e. the number of factors in each factor analyser). Alternatively it can be used to infer posterior distributions over the number of components and dimensionalities. Since all parameters are integrated out the method is not prone to overfitting. Using a stochastic procedure for adding components it is possible to perform the variational optimisation incrementally and to avoid local maxima.


Bayesian Transduction

Neural Information Processing Systems

Transduction is an inference principle that takes a training sample and aims at estimating the values of a function at given points contained in the so-called working sample, as opposed to the whole of input space for induction. Transduction provides a confidence measure on single predictions rather than classifiers - a feature particularly important for risk-sensitive applications. The possibly infinite number of functions is reduced to a finite number of equivalence classes on the working sample. A rigorous Bayesian analysis reveals that for standard classification loss we cannot benefit from considering more than one test point at a time. The probability of the label of a given test point is determined as the posterior measure of the corresponding subset of hypothesis space.
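
A toy sketch of the equivalence-class construction: enumerate a finite hypothesis set, collapse hypotheses that label the working sample identically, and read off the probability of each joint labeling as the posterior mass of its class. The 1-D threshold classifiers and the uniform prior restricted to training-consistent hypotheses are illustrative assumptions, not the paper's general setting.

from collections import defaultdict

train = [(0.2, 0), (0.4, 0), (0.7, 1), (0.9, 1)]
working = [0.5, 0.8]                      # unlabeled working sample
thresholds = [i / 10 for i in range(11)]  # h_t(x) = 1 if x >= t

def labels(t, xs):
    """Labeling a threshold hypothesis assigns to the points xs."""
    return tuple(int(x >= t) for x in xs)

# Posterior over hypotheses: uniform over those consistent with training data.
consistent = [t for t in thresholds
              if all(int(x >= t) == y for x, y in train)]

# Group consistent hypotheses into equivalence classes on the working sample.
mass = defaultdict(float)
for t in consistent:
    mass[labels(t, working)] += 1 / len(consistent)

for lab, p in sorted(mass.items()):
    print(f"P(working labels = {lab}) = {p:.2f}")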