Large-Scale Stochastic Sampling from the Probability Simplex
Baker, Jack, Fearnhead, Paul, Fox, Emily, Nemeth, Christopher
Neural Information Processing Systems
Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, the time-discretization error can dominate when we are near the boundary of the space. We demonstrate that, because of this, current SGMCMC methods for the simplex struggle with sparse simplex spaces, where many of the components are close to zero. Unfortunately, many popular large-scale Bayesian models, such as network or topic models, require inference on sparse simplex spaces.
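To see why discretization error dominates near the boundary, consider the following minimal sketch (not the paper's method): an Euler-Maruyama step of the Langevin diffusion targeting a Dirichlet density, a standard example of a sparse simplex target. The target, step size, and starting point below are illustrative assumptions. When a component is close to zero, the gradient of the log-density blows up, and even the deterministic drift of a single discrete step can carry the parameter outside the simplex.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(theta, grad_log_post, eps):
    # Euler-Maruyama discretization of the Langevin diffusion:
    #   theta' = theta + (eps / 2) * grad log pi(theta) + N(0, eps * I)
    noise = rng.normal(scale=np.sqrt(eps), size=theta.shape)
    return theta + 0.5 * eps * grad_log_post(theta) + noise

# Illustrative sparse Dirichlet(alpha) target: two components near zero.
# grad log pi(theta) = (alpha - 1) / theta (the unconstrained gradient,
# which is all a plain Euler-Maruyama step sees).
alpha = np.array([0.1, 0.1, 10.0])
grad_log_post = lambda theta: (alpha - 1.0) / theta

theta = np.array([0.01, 0.01, 0.98])  # near the simplex boundary
eps = 1e-3

# The drift term alone (before any noise is added) already pushes the
# small components negative, i.e. off the simplex:
drift = theta + 0.5 * eps * grad_log_post(theta)
print(drift)  # first two components are negative

theta_new = sgld_step(theta, grad_log_post, eps)
```

Away from the boundary (e.g. `theta = [1/3, 1/3, 1/3]`) the same step size is harmless; the error is boundary-specific, which is exactly the regime sparse topic and network models occupy.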