Uncertainty
Bayesian Monte Carlo
Ghahramani, Zoubin, Rasmussen, Carl E.
We investigate Bayesian alternatives to classical Monte Carlo methods for evaluating integrals. Bayesian Monte Carlo (BMC) allows the incorporation ofprior knowledge, such as smoothness of the integrand, into the estimation. In a simple problem we show that this outperforms any classical importance sampling method. We also attempt more challenging multidimensionalintegrals involved in computing marginal likelihoods ofstatistical models (a.k.a.
A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics
Allender, Eric, Arora, Sanjeev, Kearns, Michael, Moore, Cristopher, Russell, Alexander
We establish a new hardness result that shows that the difficulty of planning infactored Markov decision processes is representational rather than just computational. More precisely, we give a fixed family of factored MDPswith linear rewards whose optimal policies and value functions simply cannot be represented succinctly in any standard parametric form. Previous hardness results indicated that computing good policies from the MDP parameters was difficult, but left open the possibility of succinct function approximation for any fixed factored MDP. Our result applies even to policies which yield a polynomially poor approximation to the optimal value, and highlights interesting connections with the complexity classof Arthur-Merlin games.
Fractional Belief Propagation
We consider loopy belief propagation for approximate inference in probabilistic graphicalmodels. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate freeenergies, which includes the Bethe free energy and the naive mean-field free as special cases. Using the linear response correction ofthe clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
Conditional Models on the Ranking Poset
Lebanon, Guy, Lafferty, John D.
A distance-based conditional model on the ranking poset is presented for use in classification and ranking. The model is an extension of the Mallows model, and generalizes the classifier combination methods used by several ensemble learning algorithms, including error correcting output codes, discrete AdaBoost, logistic regression and cranking. The algebraic structure of the ranking poset leads to a simple Bayesian interpretation ofthe conditional model and its special cases. In addition to a unifying view, the framework suggests a probabilistic interpretation for error correcting output codes and an extension beyond the binary coding scheme.
The Effect of Singularities in a Learning Machine when the True Parameters Do Not Lie on such Singularities
Watanabe, Sumio, Amari, Shun-ichi
A lot of learning machines with hidden variables used in information sciencehave singularities in their parameter spaces. At singularities, the Fisher information matrix becomes degenerate, resulting that the learning theory of regular statistical models does not hold. Recently, it was proven that, if the true parameter is contained in singularities, then the coefficient of the Bayes generalization erroris equal to the pole of the zeta function of the Kullback information.
Data-Dependent Bounds for Bayesian Mixture Methods
We consider Bayesian mixture approaches, where a predictor is constructed by forming a weighted average of hypotheses from some space of functions. While such procedures are known to lead to optimal predictors in several cases, where sufficiently accurate prior information is available, it has not been clear how they perform when some of the prior assumptions are violated. In this paper we establish data-dependent bounds for such procedures, extending previous randomized approaches such as the Gibbs algorithm to a fully Bayesian setting. The finite-sample guarantees established in this work enable the utilization of Bayesian mixture approaches in agnostic settings, where the usual assumptions of the Bayesian paradigm fail to hold. Moreover, the bounds derived can be directly applied to non-Bayesian mixture approaches such as Bagging and Boosting.
Evidence Optimization Techniques for Estimating Stimulus-Response Functions
Sahani, Maneesh, Linden, Jennifer F.
An essential step in understanding the function of sensory nervous systems isto characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. Oneincreasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses ofone or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal firing. The estimation techniques usually employed, such as Wiener filtering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. We explore the use of Bayesian evidence-optimization techniques to condition these estimates. We show that by learning hyperparameters thatcontrol the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input.
Bayesian Models of Inductive Generalization
Sanjana, Neville E., Tenenbaum, Joshua B.
We argue that human inductive generalization is best explained in a Bayesian framework, rather than by traditional models based on similarity computations.We go beyond previous work on Bayesian concept learning by introducing an unsupervised method for constructing flexible hypothesisspaces, and we propose a version of the Bayesian Occam's razorthat trades off priors and likelihoods to prevent under-or over-generalization in these flexible spaces. We analyze two published data sets on inductive reasoning as well as the results of a new behavioral study that we have carried out.
Theory-Based Causal Inference
Tenenbaum, Joshua B., Griffiths, Thomas L.
People routinely make sophisticated causal inferences unconsciously, effortlessly, andfrom very little data - often from just one or a few observations. Weargue that these inferences can be explained as Bayesian computations over a hypothesis space of causal graphical models, shaped by strong top-down prior knowledge in the form of intuitive theories.