Goto

Collaborating Authors

 Bayesian Inference



Semi-parametric Exponential Family PCA

Neural Information Processing Systems

We present a semi-parametric latent variable model based technique for density modelling, dimensionality reduction and visualization. Unlike previous methods, we estimate the latent distribution non-parametrically which enables us to model data generated by an underlying low dimensional, multimodaldistribution. In addition, we allow the components of latent variable models to be drawn from the exponential family which makes the method suitable for special data types, for example binary or count data. Simulations on real valued, binary and count data show favorable comparisonto other related schemes both in terms of separating different populations and generalization to unseen samples.



Conditional Random Fields for Object Recognition

Neural Information Processing Systems

We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field(CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceedsby finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e.


Log-concavity Results on Gaussian Process Methods for Supervised and Unsupervised Learning

Neural Information Processing Systems

Log-concavity is an important property in the context of optimization, Laplace approximation, and sampling; Bayesian methods based on Gaussian processpriors have become quite popular recently for classification, regression, density estimation, and point process intensity estimation. Here we prove that the predictive densities corresponding to each of these applications are log-concave, given any observed data. We also prove that the likelihood is log-concave in the hyperparameters controlling the mean function of the Gaussian prior in the density and point process intensity estimationcases, and the mean, covariance, and observation noise parameters in the classification and regression cases; this result leads to a useful parameterization of these hyperparameters, indicating a suitably large class of priors for which the corresponding maximum a posteriori problem is log-concave. Introduction Bayesian methods based on Gaussian process priors have recently become quite popular for machine learning tasks (1). These techniques have enjoyed a good deal of theoretical examination, documenting their learning-theoretic (generalization) properties (2), and developing avariety of efficient computational schemes (e.g., (3-5), and references therein).


Semi-supervised Learning with Penalized Probabilistic Clustering

Neural Information Processing Systems

While clustering is usually an unsupervised operation, there are circumstances inwhich we believe (with varying degrees of certainty) that items A and B should be assigned to the same cluster, while items A and C should not. We would like such pairwise relations to influence cluster assignments of out-of-sample data in a manner consistent with the prior knowledge expressed in the training set. Our starting point is probabilistic clusteringbased on Gaussian mixture models (GMM) of the data distribution. We express clustering preferences in the prior distribution over assignments of data points to clusters. This prior penalizes cluster assignments according to the degree with which they violate the preferences.


Maximum Likelihood Estimation of Intrinsic Dimension

Neural Information Processing Systems

We propose a new method for estimating intrinsic dimension of a dataset derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically andby simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performance compared with two other intrinsic dimension estimators.


Unsupervised Variational Bayesian Learning of Nonlinear Models

Neural Information Processing Systems

In this paper we present a framework for using multi-layer perceptron (MLP)networks in nonlinear generative models trained by variational Bayesian learning. The nonlinearity is handled by linearizing it using a Gauss-Hermite quadrature at the hidden neurons. Thisyields an accurate approximation for cases of large posterior variance.The method can be used to derive nonlinear counterparts forlinear algorithms such as factor analysis, independent component/factor analysis and state-space models. This is demonstrated witha nonlinear factor analysis experiment in which even 20 sources can be estimated from a real world speech data set.


Semi-supervised Learning by Entropy Minimization

Neural Information Processing Systems

We consider the semi-supervised learning problem, where a decision rule is to be learned from labeled and unlabeled data. In this framework, we motivate minimum entropy regularization, which enables to incorporate unlabeled data in the standard supervised learning. Our approach includes otherapproaches to the semi-supervised problem as particular or limiting cases. A series of experiments illustrates that the proposed solution benefitsfrom unlabeled data. The method challenges mixture models when the data are sampled from the distribution class spanned by the generative model. The performances are definitely in favor of minimum entropy regularization when generative models are misspecified, and the weighting of unlabeled data provides robustness to the violation of the "cluster assumption". Finally, we also illustrate that the method can also be far superior to manifold learning in high dimension spaces.


Bayesian inference in spiking neurons

Neural Information Processing Systems

We propose a new interpretation of spiking neurons as Bayesian integrators accumulatingevidence over time about events in the external world or the body, and communicating to other neurons their certainties about these events. In this model, spikes signal the occurrence of new information, i.e.what cannot be predicted from the past activity. As a result, firing statistics are close to Poisson, albeit providing a deterministic representation ofprobabilities. We proceed to develop a theory of Bayesian inference in spiking neural networks, recurrent interactions implementing avariant of belief propagation. Many perceptual and motor tasks performed by the central nervous system are probabilistic, andcan be described in a Bayesian framework [4, 3].