The reproducing Stein kernel approach for post-hoc corrected sampling

Hodgkinson, Liam, Salomone, Robert, Roosta, Fred

Jan-25-2020–arXiv.org Machine Learning

Liam Hodgkinson (1), Robert Salomone (2), and Fred Roosta (3)

(1) Department of Statistics, UC Berkeley, Berkeley, CA, 94720, USA.

Abstract: Stein importance sampling [42] is a widely applicable technique based on kernelized Stein discrepancy [43], which corrects the output of approximate sampling algorithms by reweighting the empirical distribution of the samples. A general analysis of this technique is conducted for the previously unconsidered setting where samples are obtained via the simulation of a Markov chain, and applies to an arbitrary underlying Polish space. We prove that Stein importance sampling yields consistent estimators for quantities related to a target distribution of interest by using samples obtained from a geometrically ergodic Markov chain with a possibly unknown invariant measure that differs from the desired target. The approach is shown to be valid under conditions that are satisfied for a large number of unadjusted samplers, and is capable of retaining consistency when data subsampling is used. Along the way, a universal theory of reproducing Stein kernels is established, which enables the construction of kernelized Stein discrepancy on general Polish spaces, and provides sufficient conditions for kernels to be convergence-determining on such spaces. These results are of independent interest for the development of future methodology based on kernelized Stein discrepancies.

1. Introduction

Our problem of interest is the efficient computation of integrals with respect to some target probability measure π. Adopting the Monte Carlo approach, π is approximated by an empirical distribution formed from samples drawn according to π. However, in many problems of interest, it is not possible to simulate according to π exactly, and so further approximate methods must be used.
Arguably the most widely employed and general approach is Markov Chain Monte Carlo (MCMC): successively drawing samples as a realization of a Markov chain. The dominant approach to MCMC involves the simulation of a process that is π-ergodic, often constructed by the Metropolis-Hastings algorithm from an underlying irreducible and aperiodic Markov chain [58]. However, there has been significant recent interest in so-called unadjusted MCMC approaches [14, 19, 29, 45]. A common strategy with these methods is the approximate numer[...] For the same computational effort, one can achieve substantially lower variance of estimates at the cost of incurring additional (asymptotic) bias. Despite poorer asymptotic guarantees [21], the ensuing Markov chains are often rapidly mixing, and perform particularly well in high-dimensional settings [20].

All authors are supported in part by the Australian Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), under Australian Research Council grant CE140100049.
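To make the reweighting idea from the abstract concrete, the following is a minimal sketch rather than the authors' implementation. Everything specific in it is an assumption made for illustration: a one-dimensional standard normal target, an unadjusted Langevin chain with step size 0.5 as the biased sampler, an inverse multiquadric base kernel, and a Frank-Wolfe solver for the weights. The names `ula_samples`, `stein_kernel_matrix`, and `stein_weights` are hypothetical. The weights minimize the squared kernelized Stein discrepancy w^T K w over the probability simplex, where K is the Stein kernel matrix evaluated at the samples.

```python
import numpy as np

def ula_samples(n, step, rng):
    """Unadjusted Langevin chain for the assumed target N(0, 1)."""
    x = np.empty(n)
    state = 0.0
    for i in range(n):
        # Euler-Maruyama step of dX = -X dt + sqrt(2) dW, with no
        # Metropolis correction, so the invariant law is not N(0, 1).
        state = state - step * state + np.sqrt(2.0 * step) * rng.standard_normal()
        x[i] = state
    return x

def stein_kernel_matrix(x, score, c=1.0):
    """Langevin Stein kernel from the IMQ base kernel (c^2 + (x - y)^2)^(-1/2), 1D only."""
    s = score(x)
    r = x[:, None] - x[None, :]
    u = c**2 + r**2
    dxdy = u**-1.5 - 3.0 * r**2 * u**-2.5            # d^2 k / dx dy
    cross = r * u**-1.5 * (s[:, None] - s[None, :])  # s(x) dk/dy + s(y) dk/dx
    return dxdy + cross + np.outer(s, s) * u**-0.5

def stein_weights(K, iters=2000):
    """Frank-Wolfe minimisation of w^T K w over the probability simplex."""
    n = K.shape[0]
    w = np.full(n, 1.0 / n)
    Kw = K @ w
    for _ in range(iters):
        i = int(np.argmin(Kw))           # best simplex vertex e_i
        wKw = w @ Kw
        d_Kw = Kw[i] - wKw               # d^T K w with d = e_i - w
        d_Kd = K[i, i] - 2.0 * Kw[i] + wKw
        if d_Kw >= 0.0 or d_Kd <= 0.0:
            break                        # no descent direction remains
        gamma = min(1.0, -d_Kw / d_Kd)   # exact line search, clipped to [0, 1]
        w = (1.0 - gamma) * w
        w[i] += gamma
        Kw = (1.0 - gamma) * Kw + gamma * K[:, i]
    return w

rng = np.random.default_rng(0)
x = ula_samples(300, step=0.5, rng=rng)
K = stein_kernel_matrix(x, score=lambda t: -t)   # score of N(0, 1) is -x
w = stein_weights(K)

ksd_uniform = np.sqrt(max(np.mean(K), 0.0))      # equal weights 1/n
ksd_weighted = np.sqrt(max(w @ K @ w, 0.0))      # Stein importance weights
print("KSD, uniform weights:", ksd_uniform)
print("KSD, Stein weights:  ", ksd_weighted)
print("weighted second moment:", np.sum(w * x**2))
```

Because each Frank-Wolfe step uses an exact line search on a convex quadratic, the weighted discrepancy cannot exceed the equal-weight one. A general-purpose quadratic programming solver could be substituted for the weight computation; Frank-Wolfe is used here only to keep the sketch dependent on NumPy alone.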


