AITopics

This paper presents probabilistic modeling methods to solve the problem of discriminating betweenfive facial orientations with very little labeled data.

artificial intelligence, dependency, machine learning, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sharma, Ravi K., Leen, Todd K., Pavel, Misha

Probabilistic Image Sensor Fusion

We present a probabilistic method for fusion of images produced by multiple sensors. The approach is based on an image formation model in which the sensor images are noisy, locally linear functions of an underlying, true scene. A Bayesian framework then provides for maximum likelihood or maximum a posteriori estimates of the true scene from the sensor images. Maximum likelihood estimates of the parameters of the image formation model involve (local) second order image statistics, and thus are related to local principal component analysis. We demonstrate the efficacy of the method on images from visible-band and infrared sensors. 1 Introduction Advances in sensing devices have fueled the deployment of multiple sensors in several computational vision systems [1, for example]. Using multiple sensors can increase reliability with respect to single sensor systems.

artificial intelligence, machine learning, sensor image, (17 more...)

Country: North America > United States > Oregon (0.14)

Industry:

Government (0.47)
Semiconductors & Electronics (0.41)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Freeman, William T., Pasztor, Egon C.

Learning to Estimate Scenes from Images

We seek the scene interpretation that best explains image data. For example, we may want to infer the projected velocities (scene) which best explain two consecutive image frames (image).

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Coughlan, James M., Yuille, Alan L.

A Phase Space Approach to Minimax Entropy Learning and the Minutemax Approximations

There has been much recent work on measuring image statistics and on learning probability distributions on images. We observe that the mapping from images to statistics is many-to-one and show it can be quantified by a phase space factor. This phase space approach throws light on the Minimax Entropy technique for learning Gibbs distributions on images with potentials derived from image statistics and elucidates the ambiguities that are inherent to determining the potentials. In addition, it shows that if the phase factor can be approximated by an analytic distribution then this approximation yields a swift "Minutemax" algorithm that vastly reduces the computation time for Minimax entropy learning. An illustration of this concept, using a Gaussian to approximate the phase factor, gives a good approximation to the results of Zhu and Mumford (1997) in just seconds of CPU time. The phase space approach also gives insight into the multi-scale potentials found by Zhu and Mumford (1997) and suggests that the forms of the potentials are influenced greatly by phase space considerations. Finally, we prove that probability distributions learned in feature space alone are equivalent to Minimax Entropy learning with a multinomial approximation of the phase factor. 1 Introduction Bayesian probability theory gives a powerful framework for visual perception (Knill and Richards 1996). This approach, however, requires specifying prior probabilities and likelihood functions. Learning these probabilities is difficult because it requires estimating distributions on random variables of very high dimensions (for example, images with 200 x 200 pixels, or shape curves of length 400 pixels).

approximation, artificial intelligence, machine learning, (15 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Saul, Lawrence K., Rahim, Mazin G.

Markov Processes on Curves for Automatic Speech Recognition

To formulate a probabilistic model of this process, we consider two variables-one continuous (x), one discrete (s)-that evolve jointly in time. Thus the vector x traces out a smooth multidimensional curve, to each point of which the variable s attaches a discrete label. Markov processes on curves are based on the concept of arc length. After reviewing how to compute arc lengths along curves, we introduce a family of Markov processes whose predictions are invariant to nonlinear warpings of time. We then consider the ways in which these processes (and various generalizations) differ from HMMs. Markov Processes on Curves for Automatic Speech Recognition 753 2.1 Arc length Let g() define a D x D matrix-valued function over x E RP. If g() is everywhere nonnegative definite, then we can use it as a metric to compute distances along curves.

arc length, artificial intelligence, machine learning, (16 more...)

Country: North America > United States (0.15)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Nix, David A., Hogden, John E.

Maximum-Likelihood Continuity Mapping (MALCOM): An Alternative to HMMs

We describe Maximum-Likelihood Continuity Mapping (MALCOM), an alternative to hidden Markov models (HMMs) for processing sequence data such as speech. While HMMs have a discrete "hidden" space constrained bya fixed finite-automaton architecture, MALCOM has a continuous hidden space-a continuity map-that is constrained only by a smoothness requirement on paths through the space. MALCOM fits into the same probabilistic framework for speech recognition as HMMs, but it represents a more realistic model of the speech production process. To evaluate the extent to which MALCOM captures speech production information, we generated continuous speech continuity maps for three speakers and used the paths through them to predict measured speech articulator data. The median correlation between the MALCOM paths obtained from only the speech acoustics and articulator measurements was 0.77 on an independent test set not used to train MALCOM or the predictor.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Country:

North America > United States (0.70)
Europe > United Kingdom > England (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Neukirchen, Christoph, Rigoll, Gerhard

Controlling the Complexity of HMM Systems by Regularization

This paper introduces a method for regularization ofHMM systems that avoids parameter overfitting caused by insufficient training data. Regularization isdone by augmenting the EM training method by a penalty term that favors simple and smooth HMM systems. The penalty term is constructed as a mixture model of negative exponential distributions that is assumed to generate the state dependent emission probabilities of the HMMs. This new method is the successful transfer of a well known regularization approach in neural networks to the HMM domain and can be interpreted as a generalization of traditional state-tying for HMM systems. Theeffect of regularization is demonstrated for continuous speech recognition tasks by improving overfitted triphone models and by speaker adaptation with limited training data. 1 Introduction One general problem when constructing statistical pattern recognition systems is to ensure the capability to generalize well, i.e. the system must be able to classify data that is not contained in the training data set.

artificial intelligence, machine learning, pattern recognition, (19 more...)

Country: North America > United States (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.54)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Probabilistic Visualisation of High-Dimensional Binary Data

Tipping, Michael E.

We present a probabilistic latent-variable framework for data visualisation, akey feature of which is its applicability to binary and categorical data types for which few established methods exist. A variational approximation to the likelihood is exploited to derive a fast algorithm for determining the model parameters. Illustrations of application to real and synthetic binary data sets are given.

approximation, artificial intelligence, machine learning, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Marrs, Alan D., Webb, Andrew R.

Exploratory Data Analysis Using Radial Basis Function Latent Variable Models

Two developments of nonlinear latent variable models based on radial basis functions are discussed: in the first, the use of priors or constraints on allowable models is considered as a means of preserving data structure in low-dimensional representations for visualisation purposes. Also, a resampling approach is introduced which makes more effective use of the latent samples in evaluating the likelihood.

artificial intelligence, latent sample, machine learning, (15 more...)

Country: Europe > United Kingdom (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.87)

Lee, Daniel D., Sompolinsky, Haim

Learning a Continuous Hidden Variable Model for Binary Data

A directed generative model for binary data using a small number of hidden continuous units is investigated. The relationships between the correlations of the underlying continuousGaussian variables and the binary output variables are utilized to learn the appropriate weights of the network. The advantages of this approach are illustrated on a translationally invariant binarydistribution and on handwritten digit images. Introduction Principal Components Analysis (PCA) is a widely used statistical technique for representing datawith a large number of variables [1]. It is based upon the assumption that although the data is embedded in a high dimensional vector space, most of the variability in the data is captured by a much lower climensional manifold.

artificial intelligence, eigenvalue, machine learning, (13 more...)

Country:

Asia > Middle East > Israel (0.15)
North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)