Bayesian Learning


Learning the Similarity of Documents: An Information-Geometric Approach to Document Retrieval and Categorization

Neural Information Processing Systems

The project pursued in this paper is to develop, from first information-geometric principles, a general method for learning the similarity between text documents. Each individual document is modeled as a memoryless information source. Based on a latent class decomposition of the term-document matrix, a low-dimensional (curved) multinomial subfamily is learned. From this model a canonical similarity function, known as the Fisher kernel, is derived. Our approach can be applied to unsupervised and supervised learning problems alike.


Hierarchical Image Probability (HIP) Models

Neural Information Processing Systems

We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level) conditioned on the image information at lower resolutions. We would like to factor this over positions in the pyramid levels to make it tractable, but such factoring may miss long-range dependencies. To fix this, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters can be found with maximum likelihood estimation using the EM algorithm. We have obtained encouraging preliminary results on the problems of detecting various objects in SAR images and target recognition in optical aerial images.


Bayesian Reconstruction of 3D Human Motion from Single-Camera Video

Neural Information Processing Systems

The three-dimensional motion of humans is underdetermined when the observation is limited to a single camera, due to the inherent 3D ambiguity of 2D video. We present a system that reconstructs the 3D motion of human subjects from single-camera video, relying on prior knowledge about human motion, learned from training data, to resolve those ambiguities. After initialization in 2D, the tracking and 3D reconstruction are automatic; we show results for several video sequences. The results show the power of treating 3D body tracking as an inference problem.


Bayesian Modelling of fMRI Time Series

Neural Information Processing Systems

We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single-trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical methods and a variety of Markov Chain Monte Carlo (MCMC) sampling techniques. The advantage of this method is that short-time learning effects between repeated trials can be detected, since inference is based only on single-trial experiments.


Manifold Stochastic Dynamics for Bayesian Learning

Neural Information Processing Systems

We propose a new Markov Chain Monte Carlo algorithm which is a generalization of the stochastic dynamics method. The algorithm performs exploration of the state space using its intrinsic geometric structure, facilitating efficient sampling of complex distributions. Applied to Bayesian learning in neural networks, our algorithm was found to perform at least as well as the best state-of-the-art method while consuming considerably less time.
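For orientation, the baseline that this abstract generalizes can be sketched as follows. This is a minimal illustration of plain stochastic dynamics with a Metropolis correction (often called hybrid Monte Carlo), not the manifold algorithm the paper proposes. The target distribution (a one-dimensional standard normal), the function name `hmc_standard_normal`, and the step size and trajectory length are all assumptions chosen for the example.

```python
import math
import random


def hmc_standard_normal(n_samples, step_size=0.2, n_leapfrog=10, seed=0):
    """Sample from N(0, 1) with basic hybrid Monte Carlo (illustrative only)."""
    rng = random.Random(seed)

    def potential(x):
        # Negative log density of N(0, 1), up to an additive constant.
        return 0.5 * x * x

    def grad_potential(x):
        return x

    x = 0.0
    samples = []
    for _ in range(n_samples):
        # Draw a fresh momentum for each trajectory.
        p = rng.gauss(0.0, 1.0)
        x_new, p_new = x, p

        # Leapfrog integration: half momentum step, alternating full steps,
        # then a closing half momentum step.
        p_new -= 0.5 * step_size * grad_potential(x_new)
        for step in range(n_leapfrog):
            x_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new -= step_size * grad_potential(x_new)
        p_new -= 0.5 * step_size * grad_potential(x_new)

        # Metropolis accept/reject on the change in total energy.
        h_old = potential(x) + 0.5 * p * p
        h_new = potential(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples
```

Calling `hmc_standard_normal(5000)` should give samples whose mean is close to 0 and whose variance is close to 1. The sampler here moves through a flat Euclidean state space; the abstract's contribution is to exploit the intrinsic geometry of the state space instead.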


The Relevance Vector Machine

Neural Information Processing Systems

The support vector machine (SVM) is a state-of-the-art technique for regression and classification, combining excellent generalisation properties with a sparse kernel representation. However, it does suffer from a number of disadvantages, notably the absence of probabilistic outputs, the requirement to estimate a tradeoff parameter and the need to utilise 'Mercer' kernel functions. In this paper we introduce the Relevance Vector Machine (RVM), a Bayesian treatment of a generalised linear model of identical functional form to the SVM. The RVM suffers from none of the above disadvantages, and examples demonstrate that for comparable generalisation performance, the RVM requires dramatically fewer kernel functions.


On Input Selection with Reversible Jump Markov Chain Monte Carlo Sampling

Neural Information Processing Systems

In this paper we treat input selection for a radial basis function (RBF)-like classifier within a Bayesian framework. We approximate the a-posteriori distribution over both model coefficients and input subsets by samples drawn with Gibbs updates and reversible jump moves. Using some public datasets, we compare the classification accuracy of the method with a conventional ARD scheme. These datasets are also used to infer the a-posteriori probabilities of different input subsets.

1 Introduction

Methods that aim to determine relevance of inputs have always interested researchers in various communities. Classical feature subset selection techniques, as reviewed in [1], use search algorithms and evaluation criteria to determine one optimal subset.


Predictive Approaches for Choosing Hyperparameters in Gaussian Processes

Neural Information Processing Systems

Gaussian Processes are powerful regression models specified by parametrized mean and covariance functions. Standard approaches to estimating these parameters (known as hyperparameters) are Maximum Likelihood (ML) and Maximum A Posteriori (MAP) approaches. In this paper, we propose and investigate predictive approaches, namely, maximization of Geisser's Surrogate Predictive Probability (GPP) and minimization of mean square error with respect to GPP (referred to as Geisser's Predictive mean square Error (GPE)) to estimate the hyperparameters. We also derive results for the standard Cross-Validation (CV) error and make a comparison. These approaches are tested on a number of problems, and experimental results show that they are strongly competitive with existing approaches.

1 Introduction

Gaussian Processes (GPs) are powerful regression models that have gained popularity recently, though they have appeared in different forms in the literature for years.
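As a rough illustration of what a predictive criterion looks like for GP regression, the standard leave-one-out identities can be written in closed form. This is a sketch under assumptions not stated in the abstract: a zero-mean GP, training targets $\mathbf{y} = (y_1, \dots, y_N)$, and a covariance matrix $K$ over the training inputs that already includes the noise variance. The symbols $\mu_{-i}$ and $\sigma^2_{-i}$ (the predictive mean and variance for point $i$ when it is held out) are notation introduced here, and the two objectives shown are the generic leave-one-out log probability and squared error, not necessarily the paper's exact definitions of GPP and GPE.

```latex
\mu_{-i} = y_i - \frac{[K^{-1}\mathbf{y}]_i}{[K^{-1}]_{ii}},
\qquad
\sigma^2_{-i} = \frac{1}{[K^{-1}]_{ii}}

% Leave-one-out negative log predictive probability (to be minimized):
-\frac{1}{N}\sum_{i=1}^{N} \log \mathcal{N}\!\left(y_i \,;\, \mu_{-i},\, \sigma^2_{-i}\right)
= \frac{1}{N}\sum_{i=1}^{N}\left[\frac{1}{2}\log\!\left(2\pi\sigma^2_{-i}\right)
+ \frac{(y_i-\mu_{-i})^2}{2\sigma^2_{-i}}\right]

% Leave-one-out (cross-validation) squared error:
\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \mu_{-i}\right)^2
```

Both objectives depend on the hyperparameters only through $K$, so a single inversion of $K$ yields all $N$ held-out predictions without refitting the model $N$ times.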


Bayesian Model Selection for Support Vector Machines, Gaussian Processes and Other Kernel Classifiers

Neural Information Processing Systems

We present a variational Bayesian method for model selection over families of kernel classifiers such as Support Vector Machines or Gaussian processes. The algorithm needs no user interaction and is able to adapt a large number of kernel parameters to given data without having to sacrifice training cases for validation. This opens the possibility of using sophisticated families of kernels in situations where the small "standard kernel" classes are clearly inappropriate. We relate the method to other work done on Gaussian processes and clarify the relation between Support Vector Machines and certain Gaussian process models.


Greedy Importance Sampling

Neural Information Processing Systems

I present a simple variation of importance sampling that explicitly searches for important regions in the target distribution. I prove that the technique yields unbiased estimates, and show empirically it can reduce the variance of standard Monte Carlo estimators. This is achieved by concentrating samples in more significant regions of the sample space.

1 Introduction

It is well known that general inference and learning with graphical models is computationally hard [1] and it is therefore necessary to consider restricted architectures [13], or approximate algorithms to perform these tasks [3, 7]. Among the most convenient and successful techniques are stochastic methods which are guaranteed to converge to a correct solution in the limit of large samples [10, 11, 12, 15]. These methods can be easily applied to complex inference problems that overwhelm deterministic approaches.
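To make the starting point concrete, ordinary importance sampling can be sketched as below. This is the standard estimator that the abstract's method varies, not the greedy search procedure itself. The target (a standard normal), the wider normal proposal, and the names `normal_pdf` and `importance_sampling_estimate` are assumptions made for the illustration.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(f, n_samples, proposal_std=2.0, seed=0):
    """Estimate E_p[f(x)] for p = N(0, 1) using draws from q = N(0, proposal_std^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw from the proposal q rather than from the target p.
        x = rng.gauss(0.0, proposal_std)
        # Reweight by p(x) / q(x) so the estimate is unbiased for E_p[f].
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, proposal_std)
        total += weight * f(x)
    return total / n_samples
```

For example, `importance_sampling_estimate(lambda x: x * x, 100000)` estimates the second moment of a standard normal, whose exact value is 1. The variance of such an estimator depends on how well the proposal covers the regions where the weighted integrand is large, which is the issue the abstract addresses by searching for those regions explicitly.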