Stochastic Gradient MCMC with Multi-Armed Bandit Tuning
Coullon, Jeremie, South, Leah, Nemeth, Christopher
Most MCMC algorithms contain user-controlled hyperparameters which need to be carefully selected to ensure that the MCMC algorithm explores the posterior distribution efficiently. Optimal tuning rates for many popular MCMC algorithms such the random-walk (Gelman et al., 1997) or Metropolis-adjusted Langevin algorithms (Roberts and Rosenthal, 1998) rely on setting the tuning parameters according to the Metropolis-Hastings acceptance rate. Using metrics such as the acceptance rate, hyperparameters can be optimized on-the-fly within the MCMC algorithm using adaptive MCMC (Andrieu and Thoms, 2008; Vihola, 2012). However, in the context of stochastic gradient MCMC (SGMCMC), there is no acceptance rate to tune against and the trade-off between bias and variance for a fixed computational budget means that tuning approaches designed for target invariant MCMC algorithms are not applicable. Related work Previous adaptive SGMCMC algorithms have focused on embedding ideas from the optimization literature within the SGMCMC framework, e.g.
May-28-2021