Bayesian Inference
Learning Gaussian Process Kernels via Hierarchical Bayes
Schwaighofer, Anton, Tresp, Volker, Yu, Kai
We present a novel method for learning with Gaussian process regression in a hierarchical Bayesian framework. In a first step, kernel matrices on a fixed set of input points are learned from data using a simple and efficient EM algorithm. This step is nonparametric, in that it does not require a parametric form of covariance function. In a second step, kernel functions are fitted to approximate the learned covariance matrix using a generalized Nyström method, which results in a complex, data-driven kernel. We evaluate our approach as a recommendation engine for art images, where the proposed hierarchical Bayesian method leads to excellent prediction performance.
PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data
We propose a "soft greedy" learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a nontrivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets.
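To make the hypothesis class concrete, the following is a minimal sketch of a conjunction of rays, assuming a ray is a threshold test on one attribute together with a direction. The names `Ray` and `predict_conjunction` are hypothetical, and the sketch covers only how such a classifier makes predictions; it does not implement the soft greedy learning algorithm or the PAC-Bayes bound.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Ray:
    """A threshold function on a single real-valued attribute (illustrative)."""
    attribute: int    # index of the attribute the ray tests
    threshold: float  # cut point on that attribute
    direction: int    # +1: true when x > threshold; -1: true when x < threshold

    def fires(self, x: Sequence[float]) -> bool:
        # Signed distance from the threshold, positive on the "true" side.
        return self.direction * (x[self.attribute] - self.threshold) > 0


def predict_conjunction(rays: Sequence[Ray], x: Sequence[float]) -> bool:
    """Classify x as positive only if every ray in the conjunction fires."""
    return all(ray.fires(x) for ray in rays)


# Example: attribute 0 must exceed 1.0 and attribute 2 must stay below 0.5.
rays = [Ray(attribute=0, threshold=1.0, direction=+1),
        Ray(attribute=2, threshold=0.5, direction=-1)]
accepted = predict_conjunction(rays, [2.0, 9.9, 0.1])
rejected = predict_conjunction(rays, [2.0, 9.9, 0.9])
```

In this representation, sparsity corresponds to the length of the `rays` list.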
A Probabilistic Model for Online Document Clustering with Application to Novelty Detection
Zhang, Jian, Ghahramani, Zoubin, Yang, Yiming
In this paper we propose a probabilistic model for online document clustering. We use a nonparametric Dirichlet process prior to model the growing number of clusters, and use a general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichlet-multinomial distribution. We use an empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature.
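As an illustration of how a Dirichlet process prior lets the number of clusters grow, the sketch below computes the standard prior assignment probabilities for the next document: an existing cluster is chosen in proportion to its size, and a new cluster in proportion to a concentration parameter. The function name `dp_assignment_prior` and the parameter `alpha` are assumptions for illustration. The full model in the abstract would also weight each option by a Dirichlet-multinomial document likelihood, which is omitted here.

```python
from typing import List


def dp_assignment_prior(cluster_sizes: List[int], alpha: float) -> List[float]:
    """Prior probabilities for assigning the next document to a cluster.

    Returns one probability per existing cluster, followed by a final entry
    for opening a new cluster (the "novel" case).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(cluster_sizes) + alpha
    probs = [size / total for size in cluster_sizes]
    probs.append(alpha / total)  # probability mass reserved for a new cluster
    return probs


# Two existing clusters with 3 and 1 documents, concentration alpha = 1.
probs = dp_assignment_prior([3, 1], alpha=1.0)
```

Under this prior, the first document always opens a new cluster, and the chance of a new cluster shrinks as more documents are seen.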
Generative Affine Localisation and Tracking
We present an extension to the Jojic and Frey (2001) layered sprite model which allows for layers to undergo affine transformations. This extension allows for affine object pose to be inferred whilst simultaneously learning the object shape and appearance. Learning is carried out by applying an augmented variational inference algorithm which includes a global search over a discretised transform space followed by a local optimisation. To aid correct convergence, we use bottom-up cues to restrict the space of possible affine transformations. We present results on a number of video sequences and show how the model can be extended to track an object whose appearance changes throughout the sequence.
Sharing Clusters among Related Groups: Hierarchical Dirichlet Processes
Teh, Yee W., Jordan, Michael I., Beal, Matthew J., Blei, David M.
We propose the hierarchical Dirichlet process (HDP), a nonparametric Bayesian model for clustering problems involving multiple groups of data. Each group of data is modeled with a mixture, with the number of components being open-ended and inferred automatically by the model. Further, components can be shared across groups, allowing dependencies across groups to be modeled effectively as well as conferring generalization to new groups. Such grouped clustering problems occur often in practice, e.g. in the problem of topic discovery in document corpora. We report experimental results on three text corpora showing the effective and superior performance of the HDP over previous models.
Constraining a Bayesian Model of Human Visual Speed Perception
Stocker, Alan A., Simoncelli, Eero P.
It has been demonstrated that basic aspects of human visual motion perception are qualitatively consistent with a Bayesian estimation framework, where the prior probability distribution on velocity favors slow speeds. Here, we present a refined probabilistic model that can account for the typical trial-to-trial variabilities observed in psychophysical speed perception experiments. We also show that data from such experiments can be used to constrain both the likelihood and prior functions of the model. Specifically, we measured matching speeds and thresholds in a two-alternative forced choice speed discrimination task. Parametric fits to the data reveal that the likelihood function is well approximated by a log-normal distribution with a characteristic contrast-dependent variance, and that the prior distribution on velocity exhibits significantly heavier tails than a Gaussian, and approximately follows a power-law function.
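The interaction between a log-normal likelihood and a power-law prior can be sketched numerically. The example below is a toy illustration, not the fitted model from the paper: the function `map_speed`, the grid search, and all parameter values are assumptions. It takes a likelihood that is Gaussian in log speed with width `sigma` (standing in for contrast-dependent noise) and a prior proportional to `v ** -exponent`, and returns the speed that maximizes the posterior on a grid.

```python
import math


def map_speed(measured: float, sigma: float, exponent: float,
              grid_step: float = 0.01, v_max: float = 20.0) -> float:
    """Grid-search MAP speed under a log-normal likelihood and power-law prior."""
    best_v, best_log_post = grid_step, -math.inf
    n_points = int(round(v_max / grid_step))
    for i in range(1, n_points + 1):
        v = i * grid_step
        # Log-normal likelihood of the measurement, as a function of v.
        log_lik = -((math.log(measured) - math.log(v)) ** 2) / (2 * sigma ** 2)
        # Power-law prior favoring slow speeds: p(v) proportional to v^-exponent.
        log_prior = -exponent * math.log(v)
        log_post = log_lik + log_prior
        if log_post > best_log_post:
            best_v, best_log_post = v, log_post
    return best_v


# Same measured speed, two noise levels (hypothetical values).
high_contrast = map_speed(measured=5.0, sigma=0.1, exponent=1.0)
low_contrast = map_speed(measured=5.0, sigma=0.4, exponent=1.0)
```

For this toy model the maximum has the closed form `measured * exp(-exponent * sigma**2)`, so a wider likelihood pulls the estimate further toward slow speeds, which is the qualitative effect the slow-speed prior is meant to capture.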