On Russian Roulette Estimates for Bayesian Inference with Doubly-Intractable Likelihoods

arXiv.org Machine Learning

A large number of statistical models are "doubly-intractable": the likelihood normalising term, which is a function of the model parameters, is intractable, as well as the marginal likelihood (model evidence). This means that standard inference techniques to sample from the posterior, such as Markov chain Monte Carlo (MCMC), cannot be used. Examples include, but are not confined to, massive Gaussian Markov random fields, autologistic models and Exponential random graph models. A number of approximate schemes based on MCMC techniques, Approximate Bayesian computation (ABC) or analytic approximations to the posterior have been suggested, and these are reviewed here. Exact MCMC schemes, which can be applied to a subset of doubly-intractable distributions, have also been developed and are described in this paper. As yet, no general method exists which can be applied to all classes of models with doubly-intractable posteriors. In addition, taking inspiration from the Physics literature, we study an alternative method based on representing the intractable likelihood as an infinite series. Unbiased estimates of the likelihood can then be obtained by finite time stochastic truncation of the series via Russian Roulette sampling, although the estimates are not necessarily positive. Results from the Quantum Chromodynamics literature are exploited to allow the use of possibly negative estimates in a pseudo-marginal MCMC scheme such that expectations with respect to the posterior distribution are preserved. The methodology is reviewed on well-known examples such as the parameters in Ising models, the posterior for Fisher-Bingham distributions on the $d$-Sphere and a large-scale Gaussian Markov Random Field model describing the Ozone Column data. This leads to a critical assessment of the strengths and weaknesses of the methodology with pointers to ongoing research.
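The core trick described here, an unbiased estimate of an infinite series obtained by stochastic truncation, can be sketched in a few lines. The geometric continuation probability `p` and the example series are illustrative choices, not the paper's tuned scheme:

```python
import random

def russian_roulette_sum(term, p=0.9, rng=random.random):
    """Unbiasedly estimate S = sum_{k=0}^inf term(k) by Russian Roulette
    truncation: after adding term k, continue with probability p.
    Term k is divided by its survival probability p**k, so the
    expectation of the (finite-time) estimate equals the full series."""
    total, weight, k = 0.0, 1.0, 0
    while True:
        total += term(k) / weight
        if rng() > p:            # roulette: stop with probability 1 - p
            return total
        weight *= p              # survival probability of the next term
        k += 1
```

Averaging many such estimates recovers the series value, e.g. `sum_{k} 0.5**k = 2`; as the abstract notes, nothing prevents an individual estimate from being negative when the terms alternate in sign, which is why the sign-correction results from lattice QCD are needed inside the pseudo-marginal scheme.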


MLE-induced Likelihood for Markov Random Fields

arXiv.org Machine Learning

Due to the intractable partition function, the exact likelihood function for a Markov random field (MRF), in many situations, can only be approximated. Major approximation approaches include pseudolikelihood and Laplace approximation. In this paper, we propose a novel way of approximating the likelihood function through first approximating the marginal likelihood functions of individual parameters and then reconstructing the joint likelihood function from these marginal likelihood functions. For approximating the marginal likelihood functions, we derive a particular likelihood function from a modified scenario of coin tossing which is useful for capturing how one parameter interacts with the remaining parameters in the likelihood function. For reconstructing the joint likelihood function, we use an appropriate copula to link up these marginal likelihood functions. Numerical investigation suggests the superior performance of our approach. Especially as the size of the MRF increases, both the numerical performance and the computational cost of our approach remain consistently satisfactory, whereas Laplace approximation deteriorates and pseudolikelihood becomes computationally unbearable.
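For context, the pseudolikelihood baseline that the paper compares against can be sketched for a simple two-dimensional Ising model. The free boundary conditions and single-coupling parameterisation below are illustrative assumptions, and this is the standard baseline, not the paper's copula-based reconstruction:

```python
import numpy as np

def ising_pseudolikelihood(x, beta):
    """Log pseudolikelihood of a +/-1 Ising configuration x (2-D array)
    with coupling beta: the product over sites of the conditional
    probability of each spin given its four nearest neighbours, which
    avoids the intractable partition function entirely."""
    padded = np.pad(x, 1)                      # zero padding = free boundary
    nbr = (padded[:-2, 1:-1] + padded[2:, 1:-1]
           + padded[1:-1, :-2] + padded[1:-1, 2:])
    # P(x_i | neighbours) = sigmoid(2 * beta * x_i * nbr_i)
    logits = 2.0 * beta * x * nbr
    return -np.sum(np.logaddexp(0.0, -logits))
```

At `beta = 0` every conditional is 1/2, so the log pseudolikelihood is `n * log(0.5)` for `n` sites, a useful sanity check. The per-site conditionals are cheap, but maximising this surrogate over large lattices is exactly the cost the abstract describes as becoming unbearable.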


Accurate Kernel Learning for Linear Gaussian Markov Processes using a Scalable Likelihood Computation

arXiv.org Machine Learning

We report an exact likelihood computation for Linear Gaussian Markov processes that is more scalable than existing algorithms for complex models and sparsely sampled signals. Better scaling is achieved through elimination of repeated computations in the Kalman likelihood, and by using the diagonalized form of the state transition equation. Using this efficient computation, we study the accuracy of kernel learning using maximum likelihood and the posterior mean in a simulation experiment. The posterior mean with a reference prior is more accurate for complex models and sparse sampling. Because of its lower computational load, the maximum likelihood estimator is an attractive option for more densely sampled signals and lower-order models. We confirm the estimators' behavior on experimental data through an application to speleothem data.
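The prediction-error decomposition underlying the Kalman likelihood can be sketched for a scalar state-space model. This is the textbook recursion whose repeated computations the paper eliminates, not the accelerated algorithm itself, and the model `x_t = phi*x_{t-1} + N(0, q)`, `y_t = x_t + N(0, r)` is an illustrative special case:

```python
import numpy as np

def kalman_loglik(y, phi, q, r, m0=0.0, p0=1.0):
    """Exact log likelihood of observations y under a scalar linear
    Gaussian state-space model, accumulated one innovation at a time
    via the prediction-error decomposition."""
    ll, m, p = 0.0, m0, p0
    for yt in y:
        # predict the latent state one step ahead
        m, p = phi * m, phi * phi * p + q
        # innovation v and its variance s
        s = p + r
        v = yt - m
        ll += -0.5 * (np.log(2 * np.pi * s) + v * v / s)
        # measurement update
        k = p / s
        m, p = m + k * v, (1 - k) * p
    return ll
```

Each observation contributes one Gaussian term, so the cost is linear in the number of samples; the repeated per-step matrix work in the multivariate case is what diagonalizing the state transition removes.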


Markov Chain Monte-Carlo Algorithms for the Calculation of Dempster-Shafer Belief

AAAI Conferences

Dempster-Shafer theory (Shafer 1976, 1990, Dempster 1967) is a promising method for reasoning with uncertain information. The theory involves splitting the uncertain evidence into independent pieces and calculating the combined effect using Dempster's rule of combination. A major problem with this is the computational complexity of Dempster's rule. The straightforward application of the rule is exponential (where the problem parameters are the size of the frame of discernment and the number of pieces of evidence). A number of methods have been developed for improving the efficiency, e.g., (Laskey & Lehner 1989, Wilson 1989, Provan 1990, Kennes & Smets 1990), but they are limited by the #P-completeness of Dempster's rule (Orponen 1990).
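Dempster's rule itself is easy to state. The direct enumeration below illustrates the combination whose cost grows exponentially in the size of the frame of discernment, exactly what the MCMC algorithms are designed to avoid; the frozenset encoding of focal elements is an illustrative choice:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over
    subsets of a frame of discernment (focal elements as frozensets).
    Masses of intersecting focal elements multiply; mass landing on
    the empty set is conflict, renormalised away at the end."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}
```

With many pieces of evidence the number of focal elements can grow exponentially under repeated combination, which is the #P-hardness the abstract refers to.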


A Sequence Similarity Search Algorithm Based on a Probabilistic Interpretation of an Alignment Scoring System

AAAI Conferences

We present a probabilistic interpretation of local sequence alignment methods where the alignment scoring system (ASS) plays the role of a stochastic process defining a probability distribution over all sequence pairs. An explicit algorithm is given to compute the probability of two sequences given an ASS. Based on this definition, a modified version of the Smith-Waterman local similarity search algorithm has been devised, which assesses sequence relationships by log likelihood ratios. When tested on classical examples such as globins or G-protein-coupled receptors, the new method proved to be up to an order of magnitude more sensitive than the native Smith-Waterman algorithm. The comparison of a new protein sequence against a database of known proteins is perhaps the most important computer application in molecular sequence analysis.
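The native Smith-Waterman recursion that the modified method builds on can be sketched as follows. The linear gap penalty and the substitution-score interface are illustrative simplifications (affine gaps are more common in practice), and this returns the raw optimal local score, not the paper's log-likelihood-ratio assessment:

```python
def smith_waterman(a, b, score, gap=-2.0):
    """Smith-Waterman local alignment score by dynamic programming.
    score(x, y) is a substitution score; under the probabilistic
    interpretation it is read as a log likelihood ratio
    log(p(x, y) / (q(x) q(y))).  Cells are clamped at zero so an
    alignment may start and end anywhere (locality)."""
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(0.0,                              # restart
                          H[i-1][j-1] + score(a[i-1], b[j-1]),  # (mis)match
                          H[i-1][j] + gap,                  # gap in b
                          H[i][j-1] + gap)                  # gap in a
            best = max(best, H[i][j])
    return best
```

With a +2/-1 match/mismatch score, two unrelated sequences score 0 while a shared exact substring of length k scores 2k, which is the behaviour the log-likelihood-ratio reading makes precise.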