Collaborating Authors

Frey, Brendan J.


Fast, Large-Scale Transformation-Invariant Clustering

Neural Information Processing Systems

In previous work on "transformed mixtures of Gaussians" and "transformed hidden Markov models", we showed how the EM algorithm in a discrete latent variable model can be used to jointly normalize data (e.g., center images, pitch-normalize spectrograms) and learn a mixture model of the normalized data. The only input to the algorithm is the data, a list of possible transformations, and the number of clusters to find. The main criticism of this work was that the exhaustive computation of the posterior probabilities over transformations would make scaling up to large feature vectors and large sets of transformations intractable. Here, we describe how a tremendous speedup is achieved through the use of a variational technique for decoupling transformations, and a fast Fourier transform method for computing posterior probabilities.
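The FFT trick is easy to make concrete: under an isotropic Gaussian cluster, the squared distance between a translated image and the cluster mean splits into two norms plus a cross-correlation term, and the cross-correlation for every circular shift comes from a single FFT pair, so scoring all translations costs O(N log N) rather than O(N^2). Below is a minimal NumPy sketch for a single cluster; the function name, the uniform prior over shifts, and the single-cluster setting are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def translation_posterior(x, mu, sigma2):
        # ||shift_t(x) - mu||^2 = ||x||^2 + ||mu||^2 - 2 * crosscorr(x, mu)[t];
        # one FFT pair yields the cross-correlation for every circular shift.
        corr = np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(mu))).real
        log_post = -(np.sum(x**2) + np.sum(mu**2) - 2.0 * corr) / (2.0 * sigma2)
        log_post -= log_post.max()          # subtract max for numerical stability
        post = np.exp(log_post)
        return post / post.sum()            # posterior, one entry per 2-D shift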


Product Analysis: Learning to Model Observations as Products of Hidden Variables

Neural Information Processing Systems

Factor analysis and principal components analysis can be used to model linear relationships between observed variables and linearly map high-dimensional data to a lower-dimensional hidden space. In factor analysis, the observations are modeled as a linear combination of normally distributed hidden variables. We describe a nonlinear generalization of factor analysis, called "product analysis", that models the observed variables as a linear combination of products of normally distributed hidden variables. Just as factor analysis can be viewed as unsupervised linear regression on unobserved, normally distributed hidden variables, product analysis can be viewed as unsupervised linear regression on products of unobserved, normally distributed hidden variables. The mapping between the data and the hidden space is nonlinear, so we use an approximate variational technique for inference and learning.
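To make the model concrete, here is one plausible generative reading in NumPy: hidden variables are drawn i.i.d. from a standard normal, all pairwise products (including squares) are formed, and a loading matrix maps the products to the observations. The pairwise-product construction and all names here are assumptions for illustration; the paper's exact parameterization may differ.

    import numpy as np
    from itertools import combinations_with_replacement

    def sample_product_analysis(W, n_hidden, noise_std, rng):
        # z ~ N(0, I); observations are a linear map of the products z_i * z_j.
        z = rng.standard_normal(n_hidden)
        p = np.array([z[i] * z[j] for i, j in
                      combinations_with_replacement(range(n_hidden), 2)])
        return W @ p + noise_std * rng.standard_normal(W.shape[0])

    # Usage: W maps the n_hidden * (n_hidden + 1) / 2 products to D dimensions.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((10, 6))        # D = 10, n_hidden = 3 -> 6 products
    x = sample_product_analysis(W, 3, 0.1, rng)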


ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition

Neural Information Processing Systems

A challenging, unsolved problem in the speech recognition community is recognizing speech signals that are corrupted by loud, highly nonstationary noise. One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it into a clean speech recognizer. In previous work published in Eurospeech, we showed how a probability model trained on clean speech and a separate probability model trained on noise could be combined for the purpose of estimating the noise-free speech from the noisy speech. We showed how an iterative 2nd order vector Taylor series approximation could be used for probabilistic inference in this model. In many circumstances, it is not possible to obtain examples of noise without speech.
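The Taylor-series step can be sketched for the usual log-spectral mixing model, where noisy speech y relates to clean speech x and noise n through y = g(x, n) = x + log(1 + exp(n - x)). The sketch below linearizes g at the current estimate and solves the resulting Gaussian problem in closed form, iterating to convergence; the single-Gaussian priors, the noise-free observation of g, and the first-order (rather than second-order) expansion are simplifying assumptions, so this is a caricature of the inference step, not the full algorithm.

    import numpy as np

    def iterated_taylor_inference(mu_x, var_x, mu_n, var_n, y, iters=5):
        # Priors: x ~ N(mu_x, var_x), n ~ N(mu_n, var_n); observed y = g(x, n).
        x, n = np.asarray(mu_x, float), np.asarray(mu_n, float)
        for _ in range(iters):
            g = x + np.log1p(np.exp(n - x))
            s = 1.0 / (1.0 + np.exp(x - n))      # dg/dn; dg/dx = 1 - s
            # Solve the linearized MAP problem around the current (x, n).
            r = (y - g) + (1 - s) * (x - mu_x) + s * (n - mu_n)
            lam = r / ((1 - s) ** 2 * var_x + s ** 2 * var_n)
            x = mu_x + var_x * (1 - s) * lam
            n = mu_n + var_n * s * lam
        return x, n                              # estimated clean speech and noise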



Accumulator Networks: Suitors of Local Probability Propagation

Neural Information Processing Systems

One way to approximate inference in richly-connected graphical models is to apply the sum-product algorithm (a.k.a. probability propagation). The sum-product algorithm can be directly applied in Gaussian networks and in graphs for coding, but for many conditional probability functions - including the sigmoid function - direct application of the sum-product algorithm is not possible. We introduce "accumulator networks" that have low local complexity (but exponential global complexity) so the sum-product algorithm can be directly applied. In an accumulator network, the probability of a child given its parents is computed by accumulating the inputs from the parents in a Markov chain or more generally a tree. After giving expressions for inference and learning in accumulator networks, we give results on the "bars problem" and on the problem of extracting translated, overlapping faces from an image.

1 Introduction
Graphical probability models with hidden variables are capable of representing complex dependencies between variables, filling in missing data and making Bayes-optimal decisions using probabilistic inferences (Hinton and Sejnowski 1986; Pearl 1988; Neal 1992).
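The accumulation idea is easiest to see on a concrete conditional. The sketch below decomposes a binary noisy-OR conditional p(child | parents) into a chain of accumulator variables, so that every factor touches at most three binary variables and sum-product messages stay cheap; this decomposition is an illustrative instance of the idea, not the paper's exact construction.

    import numpy as np

    def noisy_or_accumulator_factors(leak, probs):
        # Chain u_0, ..., u_K: u_0 carries the leak, u_k = u_{k-1} OR
        # "parent k fired", and the child is read off the final u_K.
        factors = [np.array([1.0 - leak, leak])]     # p(u_0)
        for p in probs:
            f = np.empty((2, 2, 2))                  # [u_prev, parent, u_next]
            for u_prev in (0, 1):
                for parent in (0, 1):
                    on = 1.0 if u_prev else parent * p
                    f[u_prev, parent, 1] = on
                    f[u_prev, parent, 0] = 1.0 - on
            factors.append(f)
        return factors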


Keeping Flexible Active Contours on Track using Metropolis Updates

Neural Information Processing Systems

Condensation, a form of likelihood-weighted particle filtering, has been successfully used to infer the shapes of highly constrained "active" contours in video sequences. However, when the contours are highly flexible (e.g. for tracking fingers of a hand), a computationally burdensome number of particles is needed to successfully approximate the contour distribution. We show how the Metropolis algorithm can be used to update a particle set representing a distribution over contours at each frame in a video sequence. We compare this method to condensation using a video sequence that requires highly flexible contours, and show that the new algorithm performs dramatically better than the condensation algorithm. We discuss the incorporation of this method into the "active contour" framework where a shape-subspace is used to constrain shape variation.
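A single Metropolis sweep over the particle set is simple to write down. In the sketch below, log_lik is a hypothetical function scoring a contour parameter vector against the current frame, and the Gaussian random-walk proposal is an illustrative choice rather than the paper's.

    import numpy as np

    def metropolis_sweep(particles, log_lik, step, rng):
        # Perturb each contour and accept with probability
        # min(1, p(proposal) / p(current)), computed in the log domain.
        for i, c in enumerate(particles):
            proposal = c + step * rng.standard_normal(c.shape)
            if np.log(rng.uniform()) < log_lik(proposal) - log_lik(c):
                particles[i] = proposal
        return particles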


Sequentially Fitting ``Inclusive'' Trees for Inference in Noisy-OR Networks

Neural Information Processing Systems

Exact inference in large, richly connected noisy-OR networks is intractable, and most approximate inference algorithms tend to concentrate on a small number of most probable configurations of the hidden variables under the posterior. We present an "inclusive" variational method for bipartite noisy-OR networks that favors including all probable configurations, at the cost of including some improbable configurations. The method fits a tree to the posterior distribution sequentially, i.e., one observation at a time. Results on an ensemble of QMR-DT type networks show that the method performs better than local probability propagation and a variational upper bound for ranking most probable diseases.
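In the standard terminology, "inclusive" refers to the direction of the Kullback-Leibler divergence being fit: minimizing

    KL(P || Q) = \sum_h P(h | v) \log [ P(h | v) / Q(h) ]

over tree-structured distributions Q forces Q to place mass on every configuration h where the posterior P(h | v) does, since any configuration with P(h | v) > 0 and Q(h) near 0 drives the divergence to infinity; the more common exclusive direction, KL(Q || P), instead allows Q to lock onto a few modes.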


Sequentially Fitting ``Inclusive'' Trees for Inference in Noisy-OR Networks

Neural Information Processing Systems

For example, in medical diagnosis, the presence of a symptom can be expressed as a noisy-OR of the diseases that may cause the symptom - on some occasions, a disease may fail to activate the symptom. Inference in richly-connected noisy-OR networks is intractable, but approximate methods (e.g., variational techniques) are showing increasing promise as practical solutions. One problem with most approximations is that they tend to concentrate on a relatively small number of modes in the true posterior, ignoring other plausible configurations of the hidden variables.
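Concretely, in the standard noisy-OR parameterization used in QMR-DT-style networks, a binary symptom s with binary disease parents d_1, ..., d_K is active with probability

    p(s = 1 | d_1, ..., d_K) = 1 - (1 - q_0) \prod_{k=1}^{K} (1 - q_k)^{d_k}

where q_k is the probability that disease k alone activates the symptom and q_0 is a leak term; each factor (1 - q_k)^{d_k} is exactly the chance that a present disease fails to activate the symptom.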