Jonathan Huggins
Random Feature Stein Discrepancies
Jonathan Huggins, Lester Mackey
Computable Stein discrepancies have been deployed for a variety of applications, ranging from sampler selection in posterior inference to approximate Bayesian inference to goodness-of-fit testing. Existing convergence-determining Stein discrepancies admit strong theoretical guarantees but suffer from a computational cost that grows quadratically in the sample size. While linear-time Stein discrepancies have been proposed for goodness-of-fit testing, they exhibit avoidable degradations in testing power--even when power is explicitly optimized. To address these shortcomings, we introduce feature Stein discrepancies (ΦSDs), a new family of quality measures that can be cheaply approximated using importance sampling. We show how to construct ΦSDs that provably determine the convergence of a sample to its target and develop high-accuracy approximations--random ΦSDs (RΦSDs)--which are computable in near-linear time. In our experiments with sampler selection for approximate posterior inference and goodness-of-fit testing, RΦSDs perform as well or better than quadratic-time kernel Stein discrepancies (KSDs) while being orders of magnitude faster to compute.
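To make the near-linear-time idea concrete, here is a minimal sketch of a random-feature Stein discrepancy estimate. It is not the paper's RΦSD construction: it draws random Fourier features for a Gaussian kernel uniformly rather than the importance-sampled features the paper analyzes, and the Gaussian target score, bandwidth, and all function names are illustrative assumptions.

```python
import numpy as np

def gaussian_score(x, mean=0.0, var=1.0):
    # grad_x log p(x) for an isotropic Gaussian target (illustrative stand-in;
    # in practice this would be the score of the posterior being assessed)
    return -(x - mean) / var

def random_feature_stein_discrepancy(X, score, n_features=100, bandwidth=1.0, seed=0):
    # X: (n, d) sample; score: callable returning grad log p row-wise.
    # Approximates a squared kernel Stein discrepancy with M random Fourier
    # features for a Gaussian kernel, in O(n * M * d) time.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=1.0 / bandwidth, size=(n_features, d))  # feature frequencies
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)             # feature phases

    S = score(X)                                                  # (n, d) score evaluations
    proj = X @ W.T + b                                            # (n, M)
    phi = np.sqrt(2.0 / n_features) * np.cos(proj)                # (n, M) features
    grad_phi = -np.sqrt(2.0 / n_features) * np.sin(proj)[:, :, None] * W[None, :, :]  # (n, M, d)

    # Langevin Stein operator applied to each feature:
    # (A_p phi_m)(x) = phi_m(x) * score(x) + grad phi_m(x)
    stein_feats = phi[:, :, None] * S[:, None, :] + grad_phi      # (n, M, d)
    feature_means = stein_feats.mean(axis=0)                      # (M, d) empirical means
    return float(np.sum(feature_means ** 2))                      # squared discrepancy estimate

# Example: a sample shifted away from the target mean gets a larger discrepancy
X_good = np.random.default_rng(1).normal(size=(2000, 2))
X_bad = X_good + 1.5
print(random_feature_stein_discrepancy(X_good, gaussian_score))
print(random_feature_stein_discrepancy(X_bad, gaussian_score))
```

With n samples, M features, and dimension d, this estimate costs O(nMd) rather than the O(n^2 d) of an exact quadratic-time KSD, which is the trade-off the abstract refers to.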
Coresets for Scalable Bayesian Logistic Regression
Jonathan Huggins, Trevor Campbell, Tamara Broderick
The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification.
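A rough sketch of the coreset recipe follows. It is not the paper's algorithm, which derives sensitivity upper bounds via k-clustering; this version substitutes a crude norm-based sensitivity proxy for the importance-sampling probabilities, and the function names, the {-1, +1} label convention, and the proxy itself are illustrative assumptions.

```python
import numpy as np

def build_coreset(X, y, coreset_size=500, seed=0):
    # X: (n, d) features, y: (n,) labels in {-1, +1}.
    # Returns a weighted subset (X_c, y_c, w_c) whose weighted log-likelihood
    # approximates the full-data log-likelihood.
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Crude per-point sensitivity proxy: points far from the data mean tend to be
    # more influential for the logistic likelihood, so sample them more often.
    proxy = 1.0 + np.linalg.norm(X - X.mean(axis=0), axis=1)
    probs = proxy / proxy.sum()

    idx = rng.choice(n, size=coreset_size, replace=True, p=probs)
    weights = 1.0 / (coreset_size * probs[idx])   # importance weights (unbiased in expectation)
    return X[idx], y[idx], weights

def weighted_logistic_log_likelihood(theta, X_c, y_c, w_c):
    # Weighted log-likelihood to plug into any unmodified inference routine
    # (MAP optimization, MCMC, variational inference, ...).
    margins = y_c * (X_c @ theta)
    return np.sum(w_c * -np.logaddexp(0.0, -margins))
```

Because the coreset is just a weighted dataset, the weighted log-likelihood can be handed to an existing posterior inference algorithm without changing the algorithm itself, which is the point of the approach described above.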