Goto

Collaborating Authors

 Bayesian Learning



Exponential Family Harmoniums with an Application to Information Retrieval

Neural Information Processing Systems

Directed graphical models with one layer of observed random variables and one or more layers of hidden random variables have been the dominant modelling paradigm in many research fields. Although this approach has met with considerable success, the causal semantics of these models can make it difficult to infer the posterior distribution over the hidden variables. In this paper we propose an alternative two-layer model based on exponential family distributions and the semantics of undirected models. Inference in these "exponential family harmoniums" is fast while learning is performed by minimizing contrastive divergence. A member of this family is then studied as an alternative probabilistic model for latent semantic indexing. In experiments it is shown that they perform well on document retrieval tasks and provide an elegant solution to searching with keywords.


Identifying Protein-Protein Interaction Sites on a Genome-Wide Scale

Neural Information Processing Systems

Protein interactions typically arise from a physical interaction of one or more small sites on the surface of the two proteins. Identifying these sites is very important for drug and protein design. In this paper, we propose a computational method based on probabilistic relational model that attempts to address this task using high-throughput protein interaction data and a set of short sequence motifs. We learn the model using the EM algorithm, with a branch-and-bound algorithm as an approximate inference for the E-step. Our method searches for motifs whose presence in a pair of interacting proteins can explain their observed interaction. It also tries to determine which motif pairs have high affinity, and can therefore lead to an interaction. We show that our method is more accurate than others at predicting new protein-protein interactions. More importantly, by examining solved structures of protein complexes, we find that 2/3 of the predicted active motifs correspond to actual interaction sites.


Instance-Specific Bayesian Model Averaging for Classification

Neural Information Processing Systems

Classification algorithms typically induce population-wide models that are trained to perform well on average on expected future instances. We introduce a Bayesian framework for learning instance-specific models from data that are optimized to predict well for a particular instance. Based on this framework, we present a lazy instance-specific algorithm called ISA that performs selective model averaging over a restricted class of Bayesian networks. On experimental evaluation, this algorithm shows superior performance over model selection. We intend to apply such instance-specific algorithms to improve the performance of patient-specific predictive models induced from medical data.


Sharing Clusters among Related Groups: Hierarchical Dirichlet Processes

Neural Information Processing Systems

We propose the hierarchical Dirichlet process (HDP), a nonparametric Bayesian model for clustering problems involving multiple groups of data. Each group of data is modeled with a mixture, with the number of components being open-ended and inferred automatically by the model. Further, components can be shared across groups, allowing dependencies across groups to be modeled effectively as well as conferring generalization to new groups. Such grouped clustering problems occur often in practice, e.g. in the problem of topic discovery in document corpora. We report experimental results on three text corpora showing the effective and superior performance of the HDP over previous models.


Constraining a Bayesian Model of Human Visual Speed Perception

Neural Information Processing Systems

It has been demonstrated that basic aspects of human visual motion perception are qualitatively consistent with a Bayesian estimation framework, where the prior probability distribution on velocity favors slow speeds. Here, we present a refined probabilistic model that can account for the typical trial-to-trial variabilities observed in psychophysical speed perception experiments. We also show that data from such experiments can be used to constrain both the likelihood and prior functions of the model. Specifically, we measured matching speeds and thresholds in a two-alternative forced choice speed discrimination task. Parametric fits to the data reveal that the likelihood function is well approximated by a LogNormal distribution with a characteristic contrast-dependent variance, and that the prior distribution on velocity exhibits significantly heavier tails than a Gaussian, and approximately follows a power-law function.


Fast Rates to Bayes for Kernel Machines

Neural Information Processing Systems

We establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. In particular, for SVMs with Gaussian RBF kernels we propose a geometric condition for distributions which can be used to determine approximation properties of these kernels. Finally, we compare our methods with a recent paper of G. Blanchard et al..


Dynamic Bayesian Networks for Brain-Computer Interfaces

Neural Information Processing Systems

We describe an approach to building brain-computer interfaces (BCI) based on graphical models for probabilistic inference and learning. We show how a dynamic Bayesian network (DBN) can be used to infer probability distributions over brain-and body-states during planning and execution of actions. The DBN is learned directly from observed data and allows measured signals such as EEG and EMG to be interpreted in terms of internal states such as intent to move, preparatory activity, and movement execution. Unlike traditional classification-based approaches to BCI, the proposed approach (1) allows continuous tracking and prediction of internal states over time, and (2) generates control signals based on an entire probability distribution over states rather than binary yes/no decisions. We present preliminary results of brain-and body-state estimation using simultaneous EEG and EMG signals recorded during a self-paced left/right hand movement task.


On the Adaptive Properties of Decision Trees

Neural Information Processing Systems

Decision trees are surprisingly adaptive in three important respects: They automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower dimensional manifolds; (3) reject irrelevant features. In this paper we examine a decision tree based on dyadic splits that adapts to each of these conditions to achieve minimax optimal rates of convergence. The proposed classifier is the first known to achieve these optimal rates while being practical and implementable.


Assignment of Multiplicative Mixtures in Natural Images

Neural Information Processing Systems

In the analysis of natural images, Gaussian scale mixtures (GSM) have been used to account for the statistics of filter responses, and to inspire hierarchical cortical representational learning schemes. GSMs pose a critical assignment problem, working out which filter responses were generated by a common multiplicative factor. We present a new approach to solving this assignment problem through a probabilistic extension to the basic GSM, and show how to perform inference in the model using Gibbs sampling. We demonstrate the efficacy of the approach on both synthetic and image data. Understanding the statistical structure of natural images is an important goal for visual neuroscience. Neural representations in early cortical areas decompose images (and likely other sensory inputs) in a way that is sensitive to sophisticated aspects of their probabilistic structure. This structure also plays a key role in methods for image processing and coding. A striking aspect of natural images that has reflections in both top-down and bottom-up modeling is coordination across nearby locations, scales, and orientations. From a topdown perspective, this structure has been modeled using what is known as a Gaussian Scale Mixture model (GSM).