Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

The paper introduces the covariance-controlled adaptive Langevin thermostat (CCAdL), a Bayesian sampling method based on stochastic gradients (SG) that aims to account for the correlated errors introduced by the SG approximation of the true gradient. The authors demonstrate that CCAdL is more accurate and robust than other SG-based methods on various test problems. In general, the paper is well written but sometimes a bit hard to follow for someone who is not familiar with this type of sampling algorithm. The paper starts by reviewing various SG methods for efficient Bayesian posterior sampling (SGLD, mSGLD, SGHMC, SGNHT). It would be quite helpful if the authors could provide, for example, a table or figure that gives an overview of the different SG variants and highlights their commonalities and differences.



Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling

Shang, Xiaocheng, Zhu, Zhanxing, Leimkuhler, Benedict, Storkey, Amos J.

Neural Information Processing Systems

Monte Carlo sampling for Bayesian posterior inference is a common approach used in machine learning. The Markov Chain Monte Carlo procedures that are used are often discrete-time analogues of associated stochastic differential equations (SDEs). These SDEs are guaranteed to leave invariant the required posterior distribution. An area of current research addresses the computational benefits of stochastic gradient methods in this setting. Existing techniques rely on estimating the variance or covariance of the subsampling error, and typically assume constant variance. In this article, we propose a covariance-controlled adaptive Langevin thermostat that can effectively dissipate parameter-dependent noise while maintaining a desired target distribution. The proposed method achieves a substantial speedup over popular alternative schemes for large-scale machine learning applications.
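To make the idea concrete, here is a minimal sketch of the kind of adaptive (Nosé-Hoover) thermostat sampler the paper builds on — not the authors' exact CCAdL update, which additionally estimates and controls the noise covariance. The sketch samples a toy 1-D Gaussian posterior from artificially noisy gradients; all names and step sizes are illustrative assumptions.

```python
import numpy as np

# Hedged sketch (not the authors' exact CCAdL scheme): an SGNHT-style
# adaptive Langevin thermostat sampling a toy 1-D Gaussian posterior
# N(0, 1). Added Gaussian noise stands in for the subsampling error of a
# stochastic gradient; the thermostat variable `xi` adapts the friction
# so that excess gradient noise is dissipated and the target distribution
# is (approximately) preserved.

rng = np.random.default_rng(0)

def noisy_grad(theta, noise_std=1.0):
    # Exact gradient of U(theta) = theta^2 / 2 is theta; the added noise
    # mimics minibatch gradient error.
    return theta + noise_std * rng.normal()

def sgnht_sample(n_steps=200_000, h=0.01, A=1.0, noise_std=1.0):
    theta, p, xi = 0.0, 0.0, A   # position, momentum, thermostat variable
    samples = np.empty(n_steps)
    for t in range(n_steps):
        p += (-h * xi * p - h * noisy_grad(theta, noise_std)
              + np.sqrt(2.0 * A * h) * rng.normal())  # friction, gradient, injected noise
        theta += h * p                                # position update
        xi += h * (p * p - 1.0)                       # adapt friction to kinetic energy
        samples[t] = theta
    return samples

samples = sgnht_sample()
```

Discarding the first half as burn-in, the sample mean and standard deviation come out close to 0 and 1 despite the noisy gradients; the covariance-controlled variant proposed in the paper further corrects for parameter-dependent noise covariance rather than assuming it constant.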


Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling

Shang, Xiaocheng, Zhu, Zhanxing, Leimkuhler, Benedict, Storkey, Amos J.

arXiv.org Machine Learning
