Bayesian Learning
Unified Inference for Variational Bayesian Linear Gaussian State-Space Models
Barber, David, Chiappa, Silvia
Linear Gaussian State-Space Models are widely used and a Bayesian treatment of parameters is therefore of considerable interest. The approximate Variational Bayesian method applied to these models is an attractive approach, used successfully in applications ranging from acoustics to bioinformatics. The most challenging aspect of implementing the method is in performing inference on the hidden state sequence of the model. We show how to convert the inference problem so that standard Kalman Filtering/Smoothing recursions from the literature may be applied. This is in contrast to previously published approaches based on Belief Propagation. Our framework both simplifies and unifies the inference problem, so that future applications may be more easily developed. We demonstrate the elegance of the approach on Bayesian temporal ICA, with an application to finding independent dynamical processes underlying noisy EEG signals.
Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models
Johnson, Mark, Griffiths, Thomas L., Goldwater, Sharon
This paper introduces adaptor grammars, a class of probabilistic models of language thatgeneralize probabilistic context-free grammars (PCFGs). Adaptor grammars augment the probabilistic rules of PCFGs with "adaptors" that can induce dependenciesamong successive uses. With a particular choice of adaptor, based on the Pitman-Yor process, nonparametric Bayesian models of language using Dirichlet processes and hierarchical Dirichlet processes can be written as simple grammars. We present a general-purpose inference algorithm for adaptor grammars, making it easy to define and use such models, and illustrate how several existing nonparametric Bayesian models can be expressed within this framework.
Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
Haro, Gloria, Randall, Gregory, Sapiro, Guillermo
The study of point cloud data sampled from a stratification, a collection of manifolds withpossible different dimensions, is pursued in this paper. We present a technique for simultaneously soft clustering and estimating the mixed dimensionality anddensity of such structures. The framework is based on a maximum likelihood estimationof a Poisson mixture model. The presentation of the approach is completed with artificial and real examples demonstrating the importance of extending manifold learning to stratification learning.
Bayesian Policy Gradient Algorithms
Ghavamzadeh, Mohammad, Engel, Yaakov
Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policyby following a performance gradient estimate. Conventional policy gradient methods use Monte-Carlo techniques to estimate this gradient. Since Monte Carlo methods tend to have high variance, a large number of samples is required, resulting in slow convergence. In this paper, we propose a Bayesian framework that models the policy gradient as a Gaussian process. This reduces the number of samples needed to obtain accurate gradient estimates. Moreover, estimates of the natural gradient as well as a measure of the uncertainty in the gradient estimates are provided at little extra cost.
Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization
Ramírez, Rey, Palmer, Jason, Makeig, Scott, Rao, Bhaskar D., Wipf, David P.
The ill-posed nature of the MEG/EEG source localization problem requires the incorporation of prior assumptions when choosing an appropriate solution out of an infinite set of candidates. Bayesian methods are useful in this capacity because they allow these assumptions to be explicitly quantified. Recently, a number of empirical Bayesian approaches have been proposed that attempt a form of model selection by using the data to guide the search for an appropriate prior. While seemingly quite different in many respects, we apply a unifying framework based on automatic relevance determination (ARD) that elucidates various attributes of these methods and suggests directions for improvement. We also derive theoretical propertiesof this methodology related to convergence, local minima, and localization bias and explore connections with established algorithms.
Inferring Network Structure from Co-Occurrences
Rabbat, Michael G., Figueiredo, Mário, Nowak, Robert
We consider the problem of inferring the structure of a network from cooccurrence data:observations that indicate which nodes occur in a signaling pathway but do not directly reveal node order within the pathway. This problem is motivated by network inference problems arising in computational biology and communication systems, in which it is difficult or impossible to obtain precise time ordering information. Without order information, every permutation of the activated nodes leads to a different feasible solution, resulting in combinatorial explosion of the feasible set. However, physical principles underlying most networked systemssuggest that not all feasible solutions are equally likely. Intuitively, nodes that cooccur more frequently are probably more closely connected. Building on this intuition, we model path co-occurrences as randomly shuffled samples of a random walk on the network. We derive a computationally efficient network inference algorithm and, via novel concentration inequalities for importance samplingestimators, prove that a polynomial complexity Monte Carlo version of the algorithm converges with high probability.