AITopics | Country

We consider the problem of learning a grid-based map using a robot with noisy sensors and actuators. We compare two approaches: online EM, where the map is treated as a fixed parameter, and Bayesian inference, where the map is a (matrix-valued) random variable. We show that even on a very simple example, online EM can get stuck in local minima, which causes the robot to get "lost" and the resulting map to be useless. By contrast, the Bayesian approach, by maintaining multiple hypotheses, is much more robust. We then introduce a method for approximating the Bayesian solution, called Rao-Blackwellised particle filtering. We show that this approximation, when coupled with an active learning strategy, is fast but accurate.

Better Generative Models for Sequential Data Problems: Bidirectional Recurrent Mixture Density Networks

Schuster, Mike

Neural Information Processing Systems, Dec-31-2000

This paper describes bidirectional recurrent mixture density networks, which can model multi-modal distributions of the type $P(x_t \mid y_1^T)$ and $P(x_t \mid x_1, x_2, \ldots, x_{t-1}, y_1^T)$ without any explicit assumptions about the use of context. These expressions occur frequently in pattern recognition problems with sequential data, for example in speech recognition. Experiments show that the proposed generative models give a higher likelihood on test data compared to a traditional modeling approach, indicating that they can summarize the statistical properties of the data better. 1 Introduction Many problems of engineering interest can be formulated as sequential data problems in an abstract sense as supervised learning from sequential data, where an input vector (dimensionality D) sequence $X = x_1^T = \{x_1, x_2, \ldots$
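As a reading aid for the two distribution types named in the abstract: by the chain rule of probability, conditionals of the second type are exactly the factors of the joint distribution over a whole sequence, which is why a model of them can serve as a generative model. The shorthand $x_1^T$ for the sequence $x_1, \ldots, x_T$ is assumed here, and the identity below is the general chain rule, not a formula quoted from the paper.

```latex
P(x_1^T \mid y_1^T) \;=\; \prod_{t=1}^{T} P(x_t \mid x_1, x_2, \ldots, x_{t-1},\, y_1^T)
```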

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > Japan (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.41)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.34)

Approximate Inference Algorithms for Two-Layer Bayesian Networks

Ng, Andrew Y., Jordan, Michael I.

Neural Information Processing Systems, Dec-31-2000

We present a class of approximate inference algorithms for graphical models of the QMR-DT type. We give convergence rates for these algorithms and for the Jaakkola and Jordan (1999) algorithm, and verify these theoretical predictions empirically.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.30)
North America > United States (0.14)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Maximum Entropy Discrimination

Jaakkola, Tommi, Meila, Marina, Jebara, Tony

Neural Information Processing Systems, Dec-31-2000

We present a general framework for discriminative estimation based on the maximum entropy principle and its extensions. All calculations involve distributions over structures and/or parameters rather than specific settings and reduce to relative entropy projections. This holds even when the data is not separable within the chosen parametric class, in the context of anomaly detection rather than classification, or when the labels in the training set are uncertain or incomplete. Support vector machines are naturally subsumed under this class and we provide several extensions. We are also able to estimate exactly and efficiently discriminative distributions over tree structures of class-conditional models within this framework.

Regular and Irregular Gallager-type Error-Correcting Codes

Kabashima, Yoshiyuki, Murayama, Tatsuto, Saad, David, Vicente, Renato

Neural Information Processing Systems, Dec-31-2000

The performance of regular and irregular Gallager-type error-correcting codes is investigated via methods of statistical physics. The transmitted codeword comprises products of the original message bits selected by two randomly-constructed sparse matrices; the number of nonzero row/column elements in these matrices constitutes a family of codes. We show that Shannon's channel capacity may be saturated in equilibrium for many of the regular codes while slightly lower performance is obtained for others which may be of higher practical relevance. Decoding aspects are considered by employing the TAP approach which is identical to the commonly used belief-propagation-based decoding. We show that irregular codes may saturate Shannon's capacity but with improved dynamical properties. 1 Introduction The ever increasing information transmission in the modern world is based on reliably communicating messages through noisy transmission channels; these can be telephone lines, deep space, magnetic storage media, etc. Error-correcting codes play a significant role in correcting errors incurred during transmission; this is carried out by encoding the message prior to transmission and decoding the corrupted received code-word for retrieving the original message.

artificial intelligence, free energy, initial condition, (15 more...)

Neural Information Processing Systems

Country: Asia > Japan (0.28)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

Hierarchical Image Probability (HIP) Models

Spence, Clay, Parra, Lucas C.

Neural Information Processing Systems, Dec-31-2000

We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level) conditioned on the image information at lower resolutions. We would like to factor this over positions in the pyramid levels to make it tractable, but such factoring may miss long-range dependencies. To fix this, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters can be found with maximum likelihood estimation using the EM algorithm. We have obtained encouraging preliminary results on the problems of detecting various objects in SAR images and target recognition in optical aerial images.

artificial intelligence, information, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

A Geometric Interpretation of ν-SVM Classifiers

Crisp, David J., Burges, Christopher J. C.

Neural Information Processing Systems, Dec-31-2000

We show that the recently proposed variant of the Support Vector Machine (SVM) algorithm, known as ν-SVM, can be interpreted as a maximal separation between subsets of the convex hulls of the data, which we call soft convex hulls. The soft convex hulls are controlled by choice of the parameter ν. If the intersection of the convex hulls is empty, the hyperplane is positioned halfway between them such that the distance between convex hulls, measured along the normal, is maximized; and if it is not, the hyperplane's normal is similarly determined by the soft convex hulls, but its position (perpendicular distance from the origin) is adjusted to minimize the error sum. The proposed geometric interpretation of ν-SVM also leads to necessary and sufficient conditions for the existence of a choice of ν for which the ν-SVM solution is nontrivial. 1 Introduction Recently, Schölkopf et al. [1] introduced a new class of SVM algorithms, called ν-SVM, for both regression estimation and pattern recognition. The basic idea is to remove the user-chosen error penalty factor C that appears in SVM algorithms by introducing a new variable ρ which, in the pattern recognition case, adds another degree of freedom to the margin.
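The separable case described in the abstract (disjoint convex hulls, hyperplane halfway between them) can be illustrated with a small sketch. It covers only the ordinary convex hulls, not the soft convex hulls controlled by ν, and it finds the closest pair of hull points with a Frank-Wolfe iteration, which is a choice made here for brevity and not the paper's method. The function names and the toy points are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def closest_hull_points(xs, ys, n_iter=1000, tol=1e-12):
    """Closest points p in conv(xs) and q in conv(ys), via Frank-Wolfe
    with exact line search on the squared distance ||p - q||^2."""
    p, q = list(xs[0]), list(ys[0])
    for _ in range(n_iter):
        w = [a - b for a, b in zip(p, q)]
        # Linear minimisation step: best vertex of each hull along w.
        x = min(xs, key=lambda pt: dot(w, pt))
        y = max(ys, key=lambda pt: dot(w, pt))
        # Change in w if p moves fully to x and q moves fully to y.
        d = [(xi - pi) - (yi - qi) for xi, pi, yi, qi in zip(x, p, y, q)]
        gap = -dot(w, d)  # Frank-Wolfe gap (up to a factor 2); zero at the optimum
        if gap <= tol:
            break
        gamma = min(1.0, gap / dot(d, d))  # exact line search, clipped to [0, 1]
        p = [pi + gamma * (xi - pi) for pi, xi in zip(p, x)]
        q = [qi + gamma * (yi - qi) for qi, yi in zip(q, y)]
    return p, q

def bisecting_hyperplane(p, q):
    """Hyperplane normal.x + offset = 0 through the midpoint of p and q,
    with normal p - q (positive side contains p)."""
    normal = [a - b for a, b in zip(p, q)]
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return normal, -dot(normal, mid)

# Toy, linearly separable 2-D data (hypothetical).
class_a = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
class_b = [(3.0, 1.0), (4.0, 0.0), (4.0, 2.0)]

p, q = closest_hull_points(class_a, class_b)
normal, offset = bisecting_hyperplane(p, q)
scores_a = [dot(normal, pt) + offset for pt in class_a]
scores_b = [dot(normal, pt) + offset for pt in class_b]
```

With this sign convention, every point of `class_a` scores positive and every point of `class_b` scores negative, and the hyperplane passes through the midpoint of the closest pair.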

artificial intelligence, convex hull, machine learning, (13 more...)

Neural Information Processing Systems

Country: Oceania > Australia > South Australia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Acquisition in Autoshaping

Kakade, Sham, Dayan, Peter

Neural Information Processing Systems, Dec-31-2000

However, most models have simply ignored these data; the few that have attempted to address them have failed by at least an order of magnitude. We discuss key data on the speed of acquisition, and show how to account for them using a statistically sound model of learning, in which differential reliabilities of stimuli play a crucial role. 1 Introduction Conditioning experiments probe the ways that animals make predictions about rewards and punishments and how those predictions are used to their advantage. Substantial quantitative data are available as to how pigeons and rats acquire conditioned responses during autoshaping, which is one of the simplest paradigms of classical conditioning.

acquisition, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Channel Noise in Excitable Neural Membranes

Manwani, Amit, Steinmetz, Peter N., Koch, Christof

Neural Information Processing Systems, Dec-31-2000

Stochastic fluctuations of voltage-gated ion channels generate current and voltage noise in neuronal membranes. This noise may be a critical determinant of the efficacy of information processing within neural systems. Using Monte-Carlo simulations, we carry out a systematic investigation of the relationship between channel kinetics and the resulting membrane voltage noise using a stochastic Markov version of the Mainen-Sejnowski model of dendritic excitability in cortical neurons. Our simulations show that kinetic parameters which lead to an increase in membrane excitability (increasing channel densities, decreasing temperature) also lead to an increase in the magnitude of the sub-threshold voltage noise. Noise also increases as the membrane is depolarized from rest towards threshold. This suggests that channel fluctuations may interfere with a neuron's ability to function as an integrator of its synaptic inputs and may limit the reliability and precision of neural information processing.
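To give a feel for what a Monte-Carlo simulation of stochastic channels involves, here is a minimal toy sketch. It is not the Mainen-Sejnowski model from the abstract: it assumes independent two-state (closed/open) channels with fixed, voltage-independent rates, and all names and parameter values (`alpha`, `beta`, `dt`, the channel count) are hypothetical. It shows only that a finite channel population produces a fluctuating open count around its stationary mean.

```python
import random

def simulate_open_counts(n_channels, alpha, beta, dt, n_steps, seed=0):
    """Simulate n_channels independent two-state (closed <-> open) channels.

    alpha: closed -> open rate (1/ms); beta: open -> closed rate (1/ms).
    Uses a fixed time step dt, which requires alpha*dt and beta*dt << 1.
    Returns the number of open channels after each step.
    """
    rng = random.Random(seed)
    p_open = alpha * dt    # per-step probability that a closed channel opens
    p_close = beta * dt    # per-step probability that an open channel closes
    n_open = round(n_channels * alpha / (alpha + beta))  # start at the stationary mean
    counts = []
    for _ in range(n_steps):
        opened = sum(rng.random() < p_open for _ in range(n_channels - n_open))
        closed = sum(rng.random() < p_close for _ in range(n_open))
        n_open += opened - closed
        counts.append(n_open)
    return counts

def mean_and_variance(values):
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

N_CHANNELS = 200
counts = simulate_open_counts(N_CHANNELS, alpha=0.5, beta=1.0, dt=0.01, n_steps=10000)
mean_open, var_open = mean_and_variance(counts)
open_fraction = mean_open / N_CHANNELS  # should hover near alpha / (alpha + beta)
```

For independent two-state channels the stationary open probability is alpha / (alpha + beta), and the open count fluctuates with variance close to N p (1 - p). A model of membrane voltage noise would additionally feed the open count into a current and voltage equation with voltage-dependent rates, which this sketch omits.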

artificial intelligence, machine learning, noise, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.15)
North America > United States > Massachusetts (0.15)
North America > United States > California (0.14)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Image Recognition in Context: Application to Microscopic Urinalysis

Song, Xubo B., Sill, Joseph, Abu-Mostafa, Yaser S., Kasdan, Harvey

Neural Information Processing Systems, Dec-31-2000

We propose a new and efficient technique for incorporating contextual information into object classification. Most of the current techniques face the problem of exponential computation cost. In this paper, we propose a new general framework that incorporates partial context at a linear cost. This technique is applied to microscopic urinalysis image recognition, resulting in a significant improvement of recognition rate over the context-free approach. This gain would have been impossible using conventional context incorporation techniques.

information, machine learning, pattern recognition, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County (0.14)

Industry: Health & Medicine > Diagnostic Medicine (0.74)

Technology: