Uncertainty
Regulator Discovery from Gene Expression Time Series of Malaria Parasites: a Hierarchical Approach
Hernández-Lobato, José M., Dijkstra, Tjeerd, Heskes, Tom
We introduce a hierarchical Bayesian model for the discovery of putative regulators from gene expression data alone. The hierarchy incorporates the knowledge that there are just a few regulators, each of which regulates only a handful of genes. This is implemented through a so-called spike-and-slab prior, a mixture of Gaussians with different widths, with mixing weights from a hierarchical Bernoulli model. For efficient inference we implemented expectation propagation. Running the model on a malaria parasite data set, we found four genes with significant homology to transcription factors in an amoeba, one RNA regulator and three genes of unknown function (out of the top ten genes considered).
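As a rough illustration of the spike-and-slab prior described above, the following sketch draws coefficients from a mixture of a narrow Gaussian (the spike) and a wide Gaussian (the slab). It is a simplification, not the model from the paper: the indicator here comes from a single flat Bernoulli draw rather than a hierarchical Bernoulli model, and the function name and the parameters `p_slab`, `spike_sd`, and `slab_sd` are assumptions chosen for illustration.

```python
import random


def sample_spike_and_slab(n, p_slab=0.1, spike_sd=0.01, slab_sd=1.0, seed=0):
    """Draw n coefficients from a two-Gaussian spike-and-slab mixture.

    All parameter names and default values are hypothetical.
    """
    rng = random.Random(seed)
    weights, indicators = [], []
    for _ in range(n):
        # Bernoulli indicator: True selects the wide slab, False the narrow spike.
        z = rng.random() < p_slab
        sd = slab_sd if z else spike_sd
        weights.append(rng.gauss(0.0, sd))
        indicators.append(z)
    return weights, indicators


if __name__ == "__main__":
    w, z = sample_spike_and_slab(20)
    print(sum(z), "of", len(w), "coefficients were drawn from the slab")
```

With a small `p_slab`, most coefficients sit very close to zero and only a few are free to be large, which is the sparsity assumption the abstract describes.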
Convex Relaxations of Latent Variable Training
We investigate a new, convex relaxation of an expectation-maximization (EM) variant that approximates a standard objective while eliminating local minima. First, a cautionary result is presented, showing that any convex relaxation of EM over hidden variables must give trivial results if any dependence on the missing values is retained. Although this appears to be a strong negative outcome, we then demonstrate how the problem can be bypassed by using equivalence relations instead of value assignments over hidden variables. In particular, we develop new algorithms for estimating exponential conditional models that only require equivalence relation information over the variable values. This reformulation leads to an exact expression for EM variants in a wide range of problems. We then develop a semidefinite relaxation that yields global training by eliminating local minima.
Expectation Maximization and Posterior Constraints
Ganchev, Kuzman, Taskar, Ben, Gama, João
The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model are not observed. Very often, however, our aim is primarily to find a model that assigns values to the latent variables that have intended meaning for our data, and maximizing expected likelihood only sometimes accomplishes this. Unfortunately, it is typically difficult to add even simple a-priori information about latent variables in graphical models without making the models overly complex or intractable. In this paper, we present an efficient, principled way to inject rich constraints on the posteriors of latent variables into the EM algorithm. Our method can be used to learn tractable graphical models that satisfy additional, otherwise intractable constraints. Focusing on clustering and the alignment problem for statistical machine translation, we show that simple, intuitive posterior constraints can greatly improve performance over standard baselines and be competitive with more complex, intractable models.
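For readers unfamiliar with the baseline, the sketch below shows plain EM on a one-dimensional mixture of two unit-variance Gaussians. It does not implement the posterior constraints proposed in the paper. The function name, the initial means, and the fixed unit variance are all assumptions made to keep the example short.

```python
import math


def em_two_gaussians(data, mu=(-1.0, 1.0), n_iter=50):
    """Plain EM for a two-component, unit-variance 1-D Gaussian mixture.

    Illustrative only: no posterior constraints and no numerical safeguards.
    """
    mu0, mu1 = mu
    pi1 = 0.5  # mixing weight of component 1
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        resp = []
        for x in data:
            a = (1.0 - pi1) * math.exp(-0.5 * (x - mu0) ** 2)
            b = pi1 * math.exp(-0.5 * (x - mu1) ** 2)
            resp.append(b / (a + b))
        # M-step: re-estimate the means and mixing weight from the posteriors.
        r1 = sum(resp)
        r0 = len(data) - r1
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, data)) / r0
        mu1 = sum(r * x for r, x in zip(resp, data)) / r1
        pi1 = r1 / len(data)
    return mu0, mu1, pi1


if __name__ == "__main__":
    points = [-2.1, -1.9, -2.0, 2.0, 1.9, 2.1]
    print(em_two_gaussians(points))
```

The E-step posteriors `resp` are the quantities on which a posterior-constrained variant would impose its constraints; here they are left unconstrained.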
Learning Horizontal Connections in a Sparse Coding Model of Natural Images
Garrigues, Pierre, Olshausen, Bruno A.
It has been shown that adapting a dictionary of basis functions to the statistics of natural images so as to maximize sparsity in the coefficients results in a set of dictionary elements whose spatial properties resemble those of V1 (primary visual cortex) receptive fields. However, the resulting sparse coefficients still exhibit pronounced statistical dependencies, thus violating the independence assumption of the sparse coding model. Here, we propose a model that attempts to capture the dependencies among the basis function coefficients by including a pairwise coupling term in the prior over the coefficient activity states. When adapted to the statistics of natural images, the coupling terms learn a combination of facilitatory and inhibitory interactions among neighboring basis functions. These learned interactions may offer an explanation for the function of horizontal connections in V1, and we discuss the implications of our findings for physiological experiments.
Catching Up Faster in Bayesian Model Selection and Model Averaging
Erven, Tim V., Rooij, Steven D., Grünwald, Peter
Bayesian model averaging, model selection and their approximations such as BIC are generally statistically consistent, but sometimes achieve slower rates of convergence than other methods such as AIC and leave-one-out cross-validation. On the other hand, these other methods can be inconsistent. We identify the catch-up phenomenon as a novel explanation for the slow convergence of Bayesian methods. Based on this analysis we define the switch-distribution, a modification of the Bayesian model averaging distribution. We prove that in many situations model selection and prediction based on the switch-distribution is both consistent and achieves optimal convergence rates, thereby resolving the AIC-BIC dilemma. The method is practical; we give an efficient algorithm.
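To make the two criteria named above concrete, the sketch below computes AIC and BIC from a maximized log-likelihood using their standard textbook definitions. It does not implement the switch-distribution. The function and argument names are chosen here for illustration.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L, for k free parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L, for n observations."""
    return k * math.log(n) - 2 * log_likelihood


if __name__ == "__main__":
    # Hypothetical numbers: a 3-parameter model fit to 100 observations.
    print(aic(-100.0, 3), bic(-100.0, 3, 100))
```

BIC's per-parameter penalty `ln n` exceeds AIC's constant penalty of 2 once `n` is at least 8, so BIC favors smaller models as the sample grows. This difference in penalties underlies the contrast in consistency and convergence rate that the abstract refers to.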
A Probabilistic Approach to Language Change
Bouchard-Côté, Alexandre, Liang, Percy S., Klein, Dan, Griffiths, Thomas L.
We present a probabilistic approach to language change in which word forms are represented by phoneme sequences that undergo stochastic edits along the branches of a phylogenetic tree. Our framework combines the advantages of the classical comparative method with the robustness of corpus-based probabilistic models. We use this framework to explore the consequences of two different schemes for defining probabilistic models of phonological change, evaluating these schemes using the reconstruction of ancient word forms in Romance languages. The result is an efficient inference procedure for automatically inferring ancient word forms from modern languages, which can be generalized to support inferences about linguistic phylogenies.