Directed Networks
Nonparametric inference of prior probabilities from Bayes-optimal behavior
We discuss a method for obtaining a subject's a priori beliefs from his/her behavior in a psychophysics context, under the assumption that the behavior is (nearly) optimal from a Bayesian perspective. The method is nonparametric in the sense that we do not assume that the prior belongs to any fixed class of distributions (e.g., Gaussian). Despite this increased generality, the method is relatively simple to implement, being based in the simplest case on a linear programming algorithm, and more generally on a straightforward maximum likelihood or maximum a posteriori formulation, which turns out to be a convex optimization problem (with no non-global local maxima) in many important cases. In addition, we develop methods for analyzing the uncertainty of these estimates. We demonstrate the accuracy of the method in a simple simulated coin-flipping setting; in particular, the method is able to precisely track the evolution of the subject's posterior distribution as more and more data are observed. We close by briefly discussing an interesting connection to recent models of neural population coding.
Variational EM Algorithms for Non-Gaussian Latent Variable Models
Palmer, Jason, Kreutz-Delgado, Kenneth, Rao, Bhaskar D., Wipf, David P.
We consider criteria for variational representations of non-Gaussian latent variables, and derive variational EM algorithms in general form. We establish a general equivalence among convex bounding methods, evidence based methods, and ensemble learning/Variational Bayes methods, which has previously been demonstrated only for particular cases.
Bayesian model learning in human visual perception
Orbรกn, Gergล, Fiser, Jozsef, Aslin, Richard N., Lengyel, Mรกtรฉ
Humans make optimal perceptual decisions in noisy and ambiguous conditions. Computations underlying such optimal behavior have been shown to rely on probabilistic inference according to generative models whose structure is usually taken to be known a priori. We argue that Bayesian model selection is ideal for inferring similar and even more complex model structures from experience. We find in experiments that humans learn subtle statistical properties of visual scenes in a completely unsupervised manner. We show that these findings are well captured by Bayesian model learning within a class of models that seek to explain observed variables by independent hidden causes.
A Bayesian Spatial Scan Statistic
Neill, Daniel B., Moore, Andrew W., Cooper, Gregory F.
We propose a new Bayesian method for spatial cluster detection, the "Bayesian spatial scan statistic," and compare this method to the standard (frequentist) scan statistic approach. We demonstrate that the Bayesian statistic has several advantages over the frequentist approach, including increased power to detect clusters and (since randomization testing is unnecessary) much faster runtime. We evaluate the Bayesian and frequentist methods on the task of prospective disease surveillance: detecting spatial clusters of disease cases resulting from emerging disease outbreaks. We demonstrate that our Bayesian methods are successful in rapidly detecting outbreaks while keeping number of false positives low.
Nested sampling for Potts models
Murray, Iain, MacKay, David, Ghahramani, Zoubin, Skilling, John
Nested sampling is a new Monte Carlo method by Skilling [1] intended for general Bayesian computation. Nested sampling provides a robust alternative to annealing-based methods for computing normalizing constants. It can also generate estimates of other quantities such as posterior expectations. The key technical requirement is an ability to draw samples uniformly from the prior subject to a constraint on the likelihood. We provide a demonstration with the Potts model, an undirected graphical model.
Context as Filtering
Mochihashi, Daichi, Matsumoto, Yuji
Long-distance language modeling is important not only in speech recognition and machine translation, but also in high-dimensional discrete sequence modeling in general. However, the problem of context length has almost been neglected so far and a naรฏve bag-of-words history has been employed in natural language processing. In contrast, in this paper we view topic shifts within a text as a latent stochastic process to give an explicit probabilistic generative model that has partial exchangeability. We propose an online inference algorithm using particle filters to recognize topic shifts to employ the most appropriate length of context automatically. Experiments on the BNC corpus showed consistent improvement over previous methods involving no chronological order.
Unbiased Estimator of Shape Parameter for Spiking Irregularities under Changing Environments
Miura, Keiji, Okada, Masato, Amari, Shun-ichi
We considered a gamma distribution of interspike intervals as a statistical model for neuronal spike generation. The model parameters consist of a time-dependent firing rate and a shape parameter that characterizes spiking irregularities of individual neurons. Because the environment changes with time, observed data are generated from the time-dependent firing rate, which is an unknown function. A statistical model with an unknown function is called a semiparametric model, which is one of the unsolved problem in statistics and is generally very difficult to solve. We used a novel method of estimating functions in information geometry to estimate the shape parameter without estimating the unknown function. We analytically obtained an optimal estimating function for the shape parameter independent of the functional form of the firing rate. This estimation is efficient without Fisher information loss and better than maximum likelihood estimation.
Hyperparameter and Kernel Learning for Graph Based Semi-Supervised Classification
Kapoor, Ashish, Ahn, Hyungil, Qi, Yuan, Picard, Rosalind W.
There have been many graph-based approaches for semi-supervised classification. One problem is that of hyperparameter learning: performance depends greatly on the hyperparameters of the similarity graph, transformation of the graph Laplacian and the noise model. We present a Bayesian framework for learning hyperparameters for graph-based semisupervised classification. Given some labeled data, which can contain inaccurate labels, we pose the semi-supervised classification as an inference problem over the unknown labels. Expectation Propagation is used for approximate inference and the mean of the posterior is used for classification. The hyperparameters are learned using EM for evidence maximization. We also show that the posterior mean can be written in terms of the kernel matrix, providing a Bayesian classifier to classify new points. Tests on synthetic and real datasets show cases where there are significant improvements in performance over the existing approaches.
Worst-Case Bounds for Gaussian Process Models
Kakade, Sham M., Seeger, Matthias W., Foster, Dean P.
We present a competitive analysis of some nonparametric Bayesian algorithms in a worst-case online learning setting, where no probabilistic assumptions about the generation of the data are made. We consider models which use a Gaussian process prior (over the space of all functions) and provide bounds on the regret (under the log loss) for commonly used nonparametric Bayesian algorithms -- including Gaussian regression and logistic regression -- which show how these algorithms can perform favorably under rather general conditions. These bounds explicitly handle the infinite dimensionality of these nonparametric classes in a natural way. We also make formal connections to the minimax and minimum description length (MDL) framework. Here, we show precisely how Bayesian Gaussian regression is a minimax strategy.