 Bayesian Inference


Bayesian Map Learning in Dynamic Environments

Neural Information Processing Systems

We consider the problem of learning a grid-based map using a robot with noisy sensors and actuators. We compare two approaches: online EM, where the map is treated as a fixed parameter, and Bayesian inference, where the map is a (matrix-valued) random variable. We show that even on a very simple example, online EM can get stuck in local minima, which causes the robot to get "lost" and the resulting map to be useless. By contrast, the Bayesian approach, by maintaining multiple hypotheses, is much more robust. We then introduce a method for approximating the Bayesian solution, called Rao-Blackwellised particle filtering.


Bayesian Network Induction via Local Neighborhoods

Neural Information Processing Systems

In recent years, Bayesian networks have become a highly successful tool for diagnosis, analysis, and decision making in real-world domains. We present an efficient algorithm for learning Bayes networks from data. In contrast to the majority of work, which typically uses hill-climbing approaches that may produce dense and causally incorrect nets, our approach yields much more compact causal networks by heeding independencies in the data. Compact causal networks facilitate fast inference and are also easier to understand. We prove that under mild assumptions, our approach requires time polynomial in the size of the data and the number of nodes.


Bayesian Averaging is Well-Temperated

Neural Information Processing Systems

Bayesian predictions are stochastic, just like the predictions of any other inference scheme that generalizes from a finite sample. While a simple variational argument shows that Bayes averaging is generalization optimal given that the prior matches the teacher parameter distribution, the situation is less clear if the teacher distribution is unknown. I define a class of averaging procedures, the temperated likelihoods, including both Bayes averaging with a uniform prior and maximum likelihood estimation as special cases. I show that Bayes is generalization optimal in this family for any teacher distribution for two learning problems that are analytically tractable: learning the mean of a Gaussian and asymptotics of smooth learners.


Learning and Tracking Cyclic Human Motion

Neural Information Processing Systems

We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into "cycles". Then the mean and the principal components of the cycles are computed using a new algorithm that accounts for missing information and enforces smooth transitions between cycles. The learned temporal model provides a prior probability distribution over human motions that can be used in a Bayesian framework for tracking human subjects in complex monocular video sequences and recovering their 3D motion.


The Manhattan World Assumption: Regularities in Scene Statistics which Enable Bayesian Inference

Neural Information Processing Systems

Preliminary work by the authors made use of the so-called "Manhattan world" assumption about the scene statistics of city and indoor scenes. This assumption stated that such scenes were built on a Cartesian grid, which led to regularities in the image edge gradient statistics. In this paper we explore the general applicability of this assumption and show that, surprisingly, it holds in a large variety of less structured environments including rural scenes. This enables us, from a single image, to determine the orientation of the viewer relative to the scene structure and also to detect target objects which are not aligned with the grid. These inferences are performed using a Bayesian model with probability distributions (e.g. on the image gradient statistics) learnt from real data.


Propagation Algorithms for Variational Bayesian Learning

Neural Information Processing Systems

Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugate-exponential graphical models. We show how the belief propagation and the junction tree algorithms can be used in the inference step of variational Bayesian learning. Applying these results to the Bayesian analysis of linear-Gaussian state-space models we obtain a learning procedure that exploits the Kalman smoothing propagation, while integrating over all model parameters. We demonstrate how this can be used to infer the hidden state dimensionality of the state-space model in a variety of synthetic problems and one real high-dimensional data set.


Beyond Maximum Likelihood and Density Estimation: A Sample-Based Criterion for Unsupervised Learning of Complex Models

Neural Information Processing Systems

The goal of many unsupervised learning procedures is to bring two probability distributions into alignment. Generative models such as Gaussian mixtures and Boltzmann machines can be cast in this light, as can recoding models such as ICA and projection pursuit. We propose a novel sample-based error measure for these classes of models, which applies even in situations where maximum likelihood (ML) and probability density estimation-based formulations cannot be applied, e.g., models that are nonlinear or have intractable posteriors. Furthermore, our sample-based error measure avoids the difficulties of approximating a density function. We prove that with an unconstrained model, (1) our approach converges on the correct solution as the number of samples goes to infinity, and (2) the expected solution of our approach in the generative framework is the ML solution.


On Reversing Jensen's Inequality

Neural Information Processing Systems

Jensen's inequality is a powerful mathematical tool and one of the workhorses in statistical learning. Its applications therein include the EM algorithm, Bayesian estimation and Bayesian inference. Quite often (i.e. in discriminative learning) upper bounds are needed as well. We derive and prove an efficient analytic inequality that provides such variational upper bounds. This inequality holds for latent variable mixtures of exponential family distributions and thus spans a wide range of contemporary statistical models.
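
For orientation, the standard (forward) use of Jensen's inequality in EM can be written as a lower bound on the log-likelihood. This is a textbook sketch and not the paper's reversed inequality; the symbols x (observed data), z (latent variable), \theta (parameters) and q (any distribution over z) are notation assumed here, not taken from the abstract.

```latex
\log p(x \mid \theta)
  = \log \sum_{z} q(z)\, \frac{p(x, z \mid \theta)}{q(z)}
  \;\ge\; \sum_{z} q(z)\, \log \frac{p(x, z \mid \theta)}{q(z)}
```

The inequality follows from the concavity of the logarithm. An upper bound of the kind the abstract describes would constrain \log p(x \mid \theta) from the other side.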


Active Learning for Parameter Estimation in Bayesian Networks

Neural Information Processing Systems

Bayesian networks are graphical representations of probability distributions. In virtually all of the work on learning these networks, the assumption is that we are presented with a data set consisting of randomly generated instances from the underlying distribution. In many situations, however, we also have the option of active learning, where we have the possibility of guiding the sampling process by querying for certain types of samples. This paper addresses the problem of estimating the parameters of Bayesian networks in an active learning setting. We provide a theoretical framework for this problem, and an algorithm that chooses which active learning queries to generate based on the model learned so far.


A Maximum-Likelihood Approach to Modeling Multisensory Enhancement

Neural Information Processing Systems

Multisensory response enhancement (MRE) is the augmentation of the response of a neuron to sensory input of one modality by simultaneous input from another modality. The maximum likelihood (ML) model presented here modifies the Bayesian model for MRE (Anastasio et al.) by incorporating a decision strategy to maximize the number of correct decisions. Thus the ML model can also deal with the important tasks of stimulus discrimination and identification in the presence of incongruent visual and auditory cues. It accounts for the inverse effectiveness observed in neurophysiological recording data, and it predicts a functional relation between uni- and bimodal levels of discriminability that is testable both in neurophysiological and behavioral experiments.