 Directed Networks


Beyond Maximum Likelihood and Density Estimation: A Sample-Based Criterion for Unsupervised Learning of Complex Models

Neural Information Processing Systems

Two well-known classes of unsupervised procedures that can be cast in this manner are generative and recoding models. In a generative unsupervised framework, the environment generates training examples, which we will refer to as observations, by sampling from one distribution; the other distribution is embodied in the model. Examples of generative frameworks are mixtures of Gaussians (MoG) [2], factor analysis [4], and Boltzmann machines [8]. In the recoding unsupervised framework, the model transforms points from an observation space to an output space, and the output distribution is compared either to a reference distribution or to a distribution derived from the output distribution.


Accumulator Networks: Suitors of Local Probability Propagation

Neural Information Processing Systems

The sum-product algorithm can be directly applied in Gaussian networks and in graphs for coding, but for many conditional probability functions, including the sigmoid function, direct application of the sum-product algorithm is not possible. We introduce "accumulator networks" that have low local complexity (but exponential global complexity) so the sum-product algorithm can be directly applied. In an accumulator network, the probability of a child given its parents is computed by accumulating the inputs from the parents in a Markov chain or, more generally, a tree. After giving expressions for inference and learning in accumulator networks, we give results on the "bars problem" and on the problem of extracting translated, overlapping faces from an image.

1 Introduction

Graphical probability models with hidden variables are capable of representing complex dependencies between variables, filling in missing data, and making Bayes-optimal decisions using probabilistic inferences (Hinton and Sejnowski 1986; Pearl 1988; Neal 1992). Large, richly connected networks with many cycles can potentially be used to model complex sources of data, such as audio signals, images and video. However, when the number of cycles in the network is large (more precisely, when the cut set size is exponential), exact inference becomes intractable. Also, to learn a probability model with hidden variables, we need to fill in the missing data using probabilistic inference, so learning also becomes intractable. To cope with the intractability of exact inference, a variety of approximate inference methods have been invented, including Monte Carlo (Hinton and Sejnowski 1986; Neal 1992), Helmholtz machines (Dayan et al. 1995; Hinton et al. 1995), and variational techniques (Jordan et al. 1998).
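The accumulation idea in the abstract can be illustrated with a small sketch. It is not the paper's parameterization: the binary parents, their independence, the saturating counter, and the names `accumulate_forward` and `brute_force` are all assumptions made for illustration. Each step of the chain touches only the previous accumulator state and one parent, which is what keeps the local computation small even though the brute-force check enumerates every parent configuration.

```python
from itertools import product

def accumulate_forward(parent_probs, cap):
    """Pass a forward message along a chain of accumulators.

    parent_probs[k] is P(x_k = 1) for binary parent k (assumed independent).
    The accumulator follows s_k = min(s_{k-1} + x_k, cap), so it takes values 0..cap.
    Returns the distribution over the final accumulator state.
    """
    msg = [1.0] + [0.0] * cap  # s_0 = 0 with probability 1
    for p in parent_probs:
        nxt = [0.0] * (cap + 1)
        for s, m in enumerate(msg):
            nxt[s] += m * (1.0 - p)        # parent off: state unchanged
            nxt[min(s + 1, cap)] += m * p  # parent on: increment, saturating at cap
        msg = nxt
    return msg

def brute_force(parent_probs, cap):
    """Same distribution, computed by enumerating all 2^K parent configurations."""
    dist = [0.0] * (cap + 1)
    for config in product([0, 1], repeat=len(parent_probs)):
        weight = 1.0
        for x, p in zip(config, parent_probs):
            weight *= p if x else 1.0 - p
        dist[min(sum(config), cap)] += weight
    return dist

probs = [0.2, 0.5, 0.9, 0.1]
chain = accumulate_forward(probs, cap=2)
exact = brute_force(probs, cap=2)
```

The chain costs on the order of K times the number of accumulator states, while the enumeration costs 2^K; with `cap=1` the accumulator reduces to a logical OR of the parents.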


Propagation Algorithms for Variational Bayesian Learning

Neural Information Processing Systems

Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugate-exponential graphical models. We show how the belief propagation and the junction tree algorithms can be used in the inference step of variational Bayesian learning. Applying these results to the Bayesian analysis of linear-Gaussian state-space models, we obtain a learning procedure that exploits the Kalman smoothing propagation, while integrating over all model parameters. We demonstrate how this can be used to infer the hidden state dimensionality of the state-space model in a variety of synthetic problems and one real high-dimensional data set.

1 Introduction

Bayesian approaches to machine learning have several desirable properties. Bayesian integration does not suffer from overfitting (since nothing is fit to the data). Prior knowledge can be incorporated naturally and all uncertainty is manipulated in a consistent manner. Moreover, it is possible to learn model structures and readily compare between model classes. Unfortunately, for most models of interest a full Bayesian analysis is computationally intractable.


Automatic Choice of Dimensionality for PCA

Neural Information Processing Systems

A central issue in principal component analysis (PCA) is choosing the number of principal components to be retained. By interpreting PCA as density estimation, we show how to use Bayesian model selection to estimate the true dimensionality of the data. The resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data. The estimate involves an integral over the Stiefel manifold of k-frames, which is difficult to compute exactly. But after choosing an appropriate parameterization and applying Laplace's method, an accurate and practical estimator is obtained. In simulations, it is convincingly better than cross-validation and other proposed algorithms, and it runs much faster.


APRICODD: Approximate Policy Construction Using Decision Diagrams

Neural Information Processing Systems

We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and policies with much lower time and space requirements than exact dynamic programming. Our method reduces the sizes of the intermediate value functions generated during value iteration by replacing the values at the terminals of the ADD with ranges of values. Our method is demonstrated on a class of large MDPs (with up to 34 billion states), and we compare the results with the optimal value functions.
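The core approximation step, replacing nearby terminal values with a range, can be sketched on a flat list of terminals. This is a deliberate simplification: the method operates on ADDs, whereas the sketch below ignores the diagram structure entirely, and the greedy grouping rule, the function name `merge_terminals`, and the tolerance value are assumptions for illustration.

```python
def merge_terminals(values, tolerance):
    """Greedily group sorted terminal values so each group spans at most `tolerance`.

    Returns a dict mapping each original value to the (low, high) range
    that would replace it at the terminal.
    """
    mapping = {}
    group = []
    for v in sorted(set(values)):
        # Close the current group once the next value would exceed the allowed span.
        if group and v - group[0] > tolerance:
            for g in group:
                mapping[g] = (group[0], group[-1])
            group = []
        group.append(v)
    for g in group:  # flush the last group
        mapping[g] = (group[0], group[-1])
    return mapping

leaves = [0.0, 0.1, 0.15, 5.0, 5.05, 9.7]
ranges = merge_terminals(leaves, tolerance=0.2)
distinct = sorted(set(ranges.values()))
```

Here six distinct terminal values collapse to three ranges. In a decision diagram, terminals that share a range can be merged into one node, which is where the reduction in size would come from.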


Bayesian Video Shot Segmentation

Neural Information Processing Systems

Prior knowledge about video structure can be used both as a means to improve the performance of content analysis and to extract features that allow semantic classification. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models by introducing a Bayesian formulation for the shot segmentation problem. The new formulation is shown to extend standard thresholding methods in an adaptive and intuitive way, leading to improved segmentation accuracy.
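One way a shot-duration prior can turn a fixed threshold into an adaptive one is sketched below. The Weibull form of the prior, its shape and scale values, and the function names are assumptions chosen for illustration and are not taken from the abstract. The rule follows from comparing posterior odds: declare a boundary when the log-likelihood ratio of the frame-difference evidence exceeds log((1 - h) / h), where h is the prior probability that the shot ends at this frame given that it has lasted so far.

```python
import math

def boundary_hazard(t, shape, scale):
    """P(shot ends in frame t+1 | shot has lasted t frames) under a Weibull duration prior."""
    # Ratio of survival functions S(t+1) / S(t), with S(t) = exp(-(t / scale) ** shape).
    survival_ratio = math.exp(-(((t + 1) / scale) ** shape - (t / scale) ** shape))
    return 1.0 - survival_ratio

def boundary_threshold(t, shape, scale):
    """Log-likelihood-ratio threshold t frames after the previous boundary."""
    h = boundary_hazard(t, shape, scale)
    return math.log((1.0 - h) / h)

# Hypothetical prior: shape > 1 makes a boundary more likely the longer a shot has lasted.
thresholds = [boundary_threshold(t, shape=2.0, scale=100.0) for t in (1, 50, 200)]
```

With these assumed parameters the threshold falls as the shot gets longer, so weak evidence is rejected right after a boundary and accepted more readily late in a shot.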


Bayes Networks on Ice: Robotic Search for Antarctic Meteorites

Neural Information Processing Systems

A Bayes network based classifier for distinguishing terrestrial rocks from meteorites is implemented onboard the Nomad robot. Equipped with a camera, spectrometer and eddy current sensor, this robot searched the ice sheets of Antarctica and autonomously made the first robotic identification of a meteorite, in January 2000 at the Elephant Moraine. This paper discusses rock classification from a robotic platform, and describes the system onboard Nomad.

1 Introduction

Figure 1: Human meteorite search with snowmobiles on the Antarctic ice sheets, and on foot in the moraines.

Antarctica contains the most fertile meteorite hunting grounds on Earth. The pristine, dry and cold environment ensures that meteorites deposited there are preserved for long periods.


Learning Switching Linear Models of Human Motion

Neural Information Processing Systems

The human figure exhibits complex and rich dynamic behavior that is both nonlinear and time-varying. Effective models of human dynamics can be learned from motion capture data using switching linear dynamic system (SLDS) models. We present results for human motion synthesis, classification, and visual tracking using learned SLDS models. Since exact inference in SLDS is intractable, we present three approximate inference algorithms and compare their performance. In particular, a new variational inference algorithm is obtained by casting the SLDS model as a Dynamic Bayesian Network. Classification experiments show the superiority of SLDS over conventional HMMs for our problem domain.
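For readers unfamiliar with the model class, the generative side of an SLDS can be sketched in a few lines. This is a minimal scalar simulation, not the learned models or inference algorithms from the paper; the two regimes, the transition matrix, and the name `simulate_slds` are assumptions for illustration. A discrete Markov chain picks the active regime, and the continuous state then follows that regime's linear dynamics.

```python
import random

def simulate_slds(steps, transition, dynamics, noise_std, x0=1.0, seed=0):
    """Simulate a scalar switching linear dynamic system.

    transition[i][j] is P(next regime = j | current regime = i).
    dynamics[j] is the linear coefficient used while regime j is active.
    Returns the regime sequence and the continuous state sequence.
    """
    rng = random.Random(seed)
    regime, x = 0, x0
    regimes, states = [], []
    for _ in range(steps):
        # Sample the next regime from the current row of the transition matrix.
        regime = rng.choices(range(len(transition)), weights=transition[regime])[0]
        # Apply the active regime's linear dynamics plus Gaussian noise.
        x = dynamics[regime] * x + rng.gauss(0.0, noise_std)
        regimes.append(regime)
        states.append(x)
    return regimes, states

# Hypothetical two-regime system: slow decay versus sign-flipping decay.
transition = [[0.95, 0.05], [0.10, 0.90]]
regimes, states = simulate_slds(200, transition, dynamics=[0.99, -0.8], noise_std=0.05)
```

Inference runs in the opposite direction: given only `states` (or noisy observations of them), recover `regimes`. The number of possible regime sequences grows exponentially with the sequence length, which is the source of the intractability the abstract mentions.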


Learning and Tracking Cyclic Human Motion

Neural Information Processing Systems

We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into "cycles". Then the mean and the principal components of the cycles are computed using a new algorithm that accounts for missing information and enforces smooth transitions between cycles. The learned temporal model provides a prior probability distribution over human motions that can be used in a Bayesian framework for tracking human subjects in complex monocular video sequences and recovering their 3D motion.

1 Introduction

The modeling and tracking of human motion in video is important for problems as varied as animation, video database search, sports medicine, and human-computer interaction. Technically, the human body can be approximated by a collection of articulated limbs and its motion can be thought of as a collection of time-series describing the joint angles as they evolve over time. A key challenge in modeling these joint angles involves decomposing the time-series into suitable temporal primitives.


Feature Correspondence: A Markov Chain Monte Carlo Approach

Neural Information Processing Systems

When trying to recover 3D structure from a set of images, the most difficult problem is establishing the correspondence between the measurements. Most existing approaches assume that features can be tracked across frames, whereas methods that exploit rigidity constraints to facilitate matching do so only under restricted camera motion. In this paper we propose a Bayesian approach that avoids the brittleness associated with singling out one "best" correspondence, and instead considers the distribution over all possible correspondences. We treat both a fully Bayesian approach that yields a posterior distribution, and a MAP approach that makes use of EM to maximize this posterior. We show how Markov chain Monte Carlo methods can be used to implement these techniques in practice, and present experimental results on real data.
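The idea of sampling over correspondences rather than committing to one can be sketched with a toy Metropolis sampler. This is not the paper's model: the 1D features, the Gaussian matching likelihood, the swap proposal, and the names `energy` and `sample_correspondences` are assumptions for illustration. The sampler walks over permutations and accumulates how often each feature in the first set is matched to each feature in the second.

```python
import math
import random

def energy(perm, a, b, sigma):
    """Negative log-likelihood (up to a constant) of matching a[i] to b[perm[i]]."""
    return sum((a[i] - b[j]) ** 2 for i, j in enumerate(perm)) / (2 * sigma ** 2)

def sample_correspondences(a, b, sigma, steps, burn_in, seed=0):
    """Metropolis sampling over permutations with symmetric swap proposals.

    Returns a matrix whose entry [i][j] estimates P(a[i] corresponds to b[j]).
    """
    rng = random.Random(seed)
    n = len(a)
    perm = list(range(n))
    e = energy(perm, a, b, sigma)
    counts = [[0] * n for _ in range(n)]
    for t in range(steps):
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]      # propose swapping two assignments
        e_new = energy(perm, a, b, sigma)
        if e_new <= e or rng.random() < math.exp(e - e_new):
            e = e_new                            # accept the proposal
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap
        if t >= burn_in:
            for k, m in enumerate(perm):
                counts[k][m] += 1
    total = steps - burn_in
    return [[c / total for c in row] for row in counts]

# Hypothetical features: b is a shuffled, slightly perturbed copy of a.
a = [0.0, 1.0, 2.0]
b = [2.1, 0.1, 0.9]
marginals = sample_correspondences(a, b, sigma=0.3, steps=5000, burn_in=1000)
best = [row.index(max(row)) for row in marginals]
```

Each row of `marginals` is a distribution over candidate matches for one feature, so ambiguous features show up as spread-out rows instead of being forced into a single hard assignment.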