Bayesian Inference
A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation
Latent Dirichlet allocation (LDA) is a Bayesian network that has recently gained much popularity in applications ranging from document modeling to computer vision. Due to the large-scale nature of these applications, current inference procedures like variational Bayes and Gibbs sampling have been found lacking. In this paper we propose the collapsed variational Bayesian inference algorithm for LDA, and show that it is computationally efficient, easy to implement and significantly more accurate than standard variational Bayesian inference for LDA.
Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
This paper studies point cloud data sampled from a stratification, a collection of manifolds with possibly different dimensions. We present a technique for simultaneously soft clustering and estimating the mixed dimensionality and density of such structures. The framework is based on maximum likelihood estimation of a Poisson mixture model. The presentation of the approach is completed with artificial and real examples demonstrating the importance of extending manifold learning to stratification learning.
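As a minimal sketch of maximum likelihood estimation for a Poisson mixture with soft cluster assignments, the following uses plain expectation-maximization on integer counts. It does not attempt the paper's link between Poisson rates and local dimension or density; the function names, the fixed iteration count, and the toy data are all assumptions made for illustration.

```python
import math

def poisson_log_pmf(k, rate):
    # log of the Poisson pmf: k*log(rate) - rate - log(k!)
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def fit_poisson_mixture(counts, init_rates, n_iter=100):
    """EM for a mixture of Poissons (hypothetical helper, illustration only).

    Returns mixing weights, component rates, and the soft assignments
    (responsibilities) of each count to each component.
    """
    rates = list(init_rates)
    n_comp = len(rates)
    weights = [1.0 / n_comp] * n_comp
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each count
        resp = []
        for k in counts:
            logs = [math.log(w) + poisson_log_pmf(k, r)
                    for w, r in zip(weights, rates)]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate weights and rates from the soft assignments
        for j in range(n_comp):
            mass = max(sum(r[j] for r in resp), 1e-12)
            weighted_sum = sum(r[j] * k for r, k in zip(resp, counts))
            weights[j] = max(mass / len(counts), 1e-12)
            rates[j] = max(weighted_sum / mass, 1e-12)
    return weights, rates, resp

# Toy counts drawn by hand from two clearly separated regimes (assumed data)
counts = [1, 2, 3, 2, 1, 2, 3, 2, 18, 20, 22, 19, 21, 20]
weights, rates, resp = fit_poisson_mixture(counts, init_rates=[1.0, 10.0])
```

The responsibilities in `resp` play the role of the soft clustering; in this toy setting the two fitted rates settle near the means of the low and high groups of counts.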
Bayesian Model Scoring in Markov Random Fields
Scoring structures of undirected graphical models by means of evaluating the marginal likelihood is very hard. The main reason is the presence of the partition function which is intractable to evaluate, let alone integrate over. We propose to approximate the marginal likelihood by employing two levels of approximation: we assume normality of the posterior (the Laplace approximation) and approximate all remaining intractable quantities using belief propagation and the linear response approximation. Empirically, we find that our procedure has about two orders of magnitude better accuracy than standard BIC methods for small datasets, but deteriorates when the size of the dataset grows.
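For orientation, the two approximations to the log marginal likelihood that the abstract contrasts can be written in their standard textbook forms. The notation is an assumption and not taken from the paper: D is the data, M the model structure, d the number of parameters, N the number of data cases, \hat\theta_{MAP} the posterior mode, \hat\theta_{ML} the maximum likelihood estimate, and H the Hessian of the negative log joint at the mode.

```latex
% Laplace approximation (Gaussian posterior around the mode)
\log p(D \mid M) \approx \log p(D \mid \hat\theta_{MAP}, M)
  + \log p(\hat\theta_{MAP} \mid M)
  + \frac{d}{2} \log 2\pi
  - \frac{1}{2} \log \lvert H \rvert,
\qquad
H = -\nabla_\theta^2 \log p(D, \theta \mid M) \Big|_{\theta = \hat\theta_{MAP}}

% BIC (keeps only the terms that grow with N)
\log p(D \mid M) \approx \log p(D \mid \hat\theta_{ML}, M) - \frac{d}{2} \log N
```

In an undirected model both the likelihood term and H involve the partition function, which is where the abstract's second level of approximation (belief propagation and linear response) would enter.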
A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments
The additive clustering model is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. This paper develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features used in producing similarity judgments and their importance.
The rat as particle filter
Although theorists have interpreted classical conditioning as a laboratory model of Bayesian belief updating, a recent reanalysis showed that the key features that theoretical models capture about learning are artifacts of averaging over subjects. Rather than learning smoothly to asymptote (reflecting, according to Bayesian models, the gradual tradeoff from prior to posterior as data accumulate), subjects learn suddenly and their predictions fluctuate perpetually. We suggest that abrupt and unstable learning can be modeled by assuming subjects are conducting inference using sequential Monte Carlo sampling with a small number of samples -- one, in our simulations. Ensemble behavior resembles exact Bayesian models since, as in particle filters, it averages over many samples. Further, the model is capable of exhibiting sophisticated behaviors like retrospective revaluation at the ensemble level, even given minimally sophisticated individuals that do not track uncertainty in their beliefs over trials.
Infinite State Bayes-Nets for Structured Domains
A general modeling framework is proposed that unifies nonparametric-Bayesian models, topic-models and Bayesian networks. This class of infinite state Bayes nets (ISBN) can be viewed as directed networks of 'hierarchical Dirichlet processes' (HDPs) where the domain of the variables can be structured (e.g. Existing models, such as nested-DP, Pachinko allocation, mixed membership stochastic block models as well as a number of new models are described as ISBNs. Two experiments have been performed to illustrate these ideas.
A Bayesian Framework for Cross-Situational Word-Learning
For infants, early word learning is a chicken-and-egg problem. One way to learn a word is to observe that it co-occurs with a particular referent across different situations. Another way is to use the social context of an utterance to infer the intended referent of a word. Here we present a Bayesian model of cross-situational word learning, and an extension of this model that also learns which social cues are relevant to determining reference. We test our model on a small corpus of mother-infant interaction and find it performs better than competing models. Finally, we show that our model accounts for experimental phenomena including mutual exclusivity, fast-mapping, and generalization from social cues.
Density Estimation under Independent Similarly Distributed Sampling Assumptions
A method is proposed for semiparametric estimation where parametric and nonparametric criteria are exploited in density estimation and unsupervised learning. This is accomplished by making sampling assumptions on a dataset that smoothly interpolate between the extreme of independently distributed (or id) sample data (as in nonparametric kernel density estimators) and the extreme of independent identically distributed (or iid) sample data. This article makes independent similarly distributed (or isd) sampling assumptions and interpolates between these two extremes using a scalar parameter. The parameter controls a Bhattacharyya affinity penalty between pairs of distributions on samples. Surprisingly, the isd method maintains certain consistency and unimodality properties akin to maximum likelihood estimation.
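The Bhattacharyya affinity mentioned above measures the overlap between two distributions. A minimal sketch for discrete distributions follows; the function name is hypothetical, and the way the isd method turns this affinity into a penalty weighted by its scalar parameter is not reproduced here.

```python
import math

def bhattacharyya_affinity(p, q):
    """Bhattacharyya affinity sum_i sqrt(p_i * q_i) of two discrete
    distributions given as equal-length lists of probabilities.

    The value is 1 when the distributions are identical and 0 when
    their supports do not overlap.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

same = bhattacharyya_affinity([0.5, 0.5], [0.5, 0.5])      # identical
disjoint = bhattacharyya_affinity([1.0, 0.0], [0.0, 1.0])  # no overlap
partial = bhattacharyya_affinity([0.9, 0.1], [0.1, 0.9])   # some overlap
```

Read this way, a penalty that rewards high affinity between the per-sample distributions pulls them toward a single shared distribution (the iid extreme), while a weak penalty leaves each sample with its own distribution (the id extreme).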
Comparing Bayesian models for multisensory cue combination without mandatory integration
Bayesian models of multisensory perception traditionally address the problem of estimating an underlying variable that is assumed to be the cause of the two sensory signals. The brain, however, has to solve a more general problem: it also has to establish which signals come from the same source and should be integrated, and which ones do not and should be segregated. In the last couple of years, a few models have been proposed to solve this problem in a Bayesian fashion. One of these has the strength that it formalizes the causal structure of sensory signals. We first compare these models on a formal level.
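The traditional setting described in the first sentence, where both signals are assumed to share one cause, has a well-known closed form under Gaussian noise and a flat prior: the estimate is a reliability-weighted average of the two cues. The sketch below illustrates only that mandatory-integration baseline, not the models the paper compares; the function name and the example numbers are assumptions.

```python
import math

def fuse_cues(x1, sigma1, x2, sigma2):
    """Combine two noisy measurements of one underlying variable.

    Assumes independent Gaussian noise with standard deviations sigma1
    and sigma2 and a flat prior, so each cue is weighted by its inverse
    variance (its reliability).
    """
    w1 = 1.0 / sigma1 ** 2
    w2 = 1.0 / sigma2 ** 2
    estimate = (w1 * x1 + w2 * x2) / (w1 + w2)
    fused_sigma = math.sqrt(1.0 / (w1 + w2))
    return estimate, fused_sigma

# Equally reliable cues: the estimate is their midpoint
midpoint, midpoint_sigma = fuse_cues(0.0, 1.0, 2.0, 1.0)

# A more reliable first cue pulls the estimate toward it
pulled, pulled_sigma = fuse_cues(0.0, 1.0, 2.0, 3.0)
```

Because this rule always fuses, it cannot segregate signals that come from different sources, which is the limitation the abstract raises.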
Optimal models of sound localization by barn owls
Sound localization by barn owls is commonly modeled as a matching procedure where localization cues derived from auditory inputs are compared to stored templates. While the matching models can explain properties of neural responses, no model explains how the owl resolves spatial ambiguity in the localization cues to produce accurate localization near the center of gaze. Here, we examine two models for the barn owl's sound localization behavior. First, we consider a maximum likelihood estimator in order to further evaluate the cue matching model. Second, we consider a maximum a posteriori estimator to test if a Bayesian model with a prior that emphasizes directions near the center of gaze can reproduce the owl's localization behavior.
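The difference between the two estimators can be illustrated with a deliberately simplified sketch: a single cue equal to the true direction plus Gaussian noise, and a zero-centered Gaussian prior standing in for a prior that emphasizes the center of gaze. None of this comes from the paper; the cue model, the prior shape, the grid, and all names and numbers are assumptions, and the ambiguous localization cues the abstract refers to are not modeled.

```python
import math

def log_gauss(x, mean, sd):
    # log density of a Gaussian with the given mean and standard deviation
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def estimate_direction(cue, noise_sd, prior_sd=None):
    """Grid search over candidate directions in degrees.

    With prior_sd=None only the likelihood is maximized (maximum
    likelihood). Otherwise a Gaussian log prior centered on 0 degrees is
    added to the log likelihood (maximum a posteriori).
    """
    grid = [step * 0.5 for step in range(-180, 181)]  # -90 to 90 degrees
    best_theta, best_score = None, -math.inf
    for theta in grid:
        score = log_gauss(cue, theta, noise_sd)
        if prior_sd is not None:
            score += log_gauss(theta, 0.0, prior_sd)
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta

ml_estimate = estimate_direction(cue=40.0, noise_sd=10.0)
map_estimate = estimate_direction(cue=40.0, noise_sd=10.0, prior_sd=20.0)
```

In this toy setting the maximum likelihood estimate sits at the cue value, while the maximum a posteriori estimate is drawn toward 0 degrees, more strongly so when the cue is noisy relative to the width of the prior.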