Uncertainty
Semi-supervised Learning by Entropy Minimization
Grandvalet, Yves, Bengio, Yoshua
We consider the semi-supervised learning problem, where a decision rule is to be learned from labeled and unlabeled data. In this framework, we motivate minimum entropy regularization, which enables to incorporate unlabeled data in the standard supervised learning. Our approach includes other approaches to the semi-supervised problem as particular or limiting cases. A series of experiments illustrates that the proposed solution benefits from unlabeled data. The method challenges mixture models when the data are sampled from the distribution class spanned by the generative model. The performances are definitely in favor of minimum entropy regularization when generative models are misspecified, and the weighting of unlabeled data provides robustness to the violation of the "cluster assumption". Finally, we also illustrate that the method can also be far superior to manifold learning in high dimension spaces.
Probabilistic Computation in Spiking Populations
Zemel, Richard S., Natarajan, Rama, Dayan, Peter, Huys, Quentin J.
As animals interact with their environments, they must constantly update estimates about their states. Bayesian models combine prior probabilities, a dynamical model and sensory evidence to update estimates optimally. These models are consistent with the results of many diverse psychophysical studies. However, little is known about the neural representation and manipulation of such Bayesian information, particularly in populations of spiking neurons. We consider this issue, suggesting a model based on standard neural architecture and activations. We illustrate the approach on a simple random walk example, and apply it to a sensorimotor integration task that provides a particularly compelling example of dynamic probabilistic computation. Bayesian models have been used to explain a gamut of experimental results in tasks which require estimates to be derived from multiple sensory cues.
Conditional Random Fields for Object Recognition
Quattoni, Ariadna, Collins, Michael, Darrell, Trevor
We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e.
Similarity and Discrimination in Classical Conditioning: A Latent Variable Account
Courville, Aaron C., Daw, Nathaniel D., Touretzky, David S.
We propose a probabilistic, generative account of configural learning phenomena in classical conditioning. Configural learning experiments probe how animals discriminate and generalize between patterns of simultaneously presented stimuli (such as tones and lights) that are differentially predictive of reinforcement. Previous models of these issues have been successful more on a phenomenological than an explanatory level: they reproduce experimental findings but, lacking formal foundations, provide scant basis for understanding why animals behave as they do. We present a theory that clarifies seemingly arbitrary aspects of previous models while also capturing a broader set of data.
A Probabilistic Model for Online Document Clustering with Application to Novelty Detection
Zhang, Jian, Ghahramani, Zoubin, Yang, Yiming
In this paper we propose a probabilistic model for online document clustering. We use nonparametric Dirichlet process prior to model the growing number of clusters, and use a prior of general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichletmultinomial distribution. We use empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature.
The Rescorla-Wagner Algorithm and Maximum Likelihood Estimation of Causal Parameters
This paper analyzes generalization of the classic Rescorla-Wagner (R-W) learning algorithm and studies their relationship to Maximum Likelihood estimation of causal parameters. We prove that the parameters of two popular causal models, P and P C, can be learnt by the same generalized linear Rescorla-Wagner (GLRW) algorithm provided genericity conditions apply. We characterize the fixed points of these GLRW algorithms and calculate the fluctuations about them, assuming that the input is a set of i.i.d.
Generative Affine Localisation and Tracking
We present an extension to the Jojic and Frey (2001) layered sprite model which allows for layers to undergo affine transformations. This extension allows for affine object pose to be inferred whilst simultaneously learning the object shape and appearance. Learning is carried out by applying an augmented variational inference algorithm which includes a global search over a discretised transform space followed by a local optimisation. To aid correct convergence, we use bottom-up cues to restrict the space of possible affine transformations. We present results on a number of video sequences and show how the model can be extended to track an object whose appearance changes throughout the sequence.
Exponential Family Harmoniums with an Application to Information Retrieval
Welling, Max, Rosen-zvi, Michal, Hinton, Geoffrey E.
Directed graphical models with one layer of observed random variables and one or more layers of hidden random variables have been the dominant modelling paradigm in many research fields. Although this approach has met with considerable success, the causal semantics of these models can make it difficult to infer the posterior distribution over the hidden variables. In this paper we propose an alternative two-layer model based on exponential family distributions and the semantics of undirected models. Inference in these "exponential family harmoniums" is fast while learning is performed by minimizing contrastive divergence. A member of this family is then studied as an alternative probabilistic model for latent semantic indexing. In experiments it is shown that they perform well on document retrieval tasks and provide an elegant solution to searching with keywords.
Identifying Protein-Protein Interaction Sites on a Genome-Wide Scale
Wang, Haidong, Segal, Eran, Ben-Hur, Asa, Koller, Daphne, Brutlag, Douglas L.
Protein interactions typically arise from a physical interaction of one or more small sites on the surface of the two proteins. Identifying these sites is very important for drug and protein design. In this paper, we propose a computational method based on probabilistic relational model that attempts to address this task using high-throughput protein interaction data and a set of short sequence motifs. We learn the model using the EM algorithm, with a branch-and-bound algorithm as an approximate inference for the E-step. Our method searches for motifs whose presence in a pair of interacting proteins can explain their observed interaction. It also tries to determine which motif pairs have high affinity, and can therefore lead to an interaction. We show that our method is more accurate than others at predicting new protein-protein interactions. More importantly, by examining solved structures of protein complexes, we find that 2/3 of the predicted active motifs correspond to actual interaction sites.