 svrg-ld







Response to Reviewer 2: Thanks for your helpful comments

Neural Information Processing Systems

Response to Reviewer 1: Thank you for your supportive comments! We will fix the typos in the final version. "The proposed algorithm is a relatively standard extension of SG-HMC and SGLD." From the perspective of algorithm design, we admit that our algorithm is an extension of SG-HMC. More importantly, the corresponding theoretical guarantees of our algorithm outperform the state of the art. Q2: "The following articles might also be related..." A2: Thank you for pointing out these related articles. We will definitely cite and discuss them in the final version. "Why not show figures that compare these samples against some ground truth, for example, those obtained by HMC (which is feasible to obtain for GMM and ICA)?"




Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

Kinoshita, Yuri, Suzuki, Taiji

arXiv.org Artificial Intelligence

Stochastic gradient Langevin dynamics is one of the most fundamental algorithms for solving sampling problems and the non-convex optimization problems that appear in several machine learning applications. Its variance-reduced versions in particular have recently gained attention. In this paper, we study two such variants, namely the Stochastic Variance Reduced Gradient Langevin Dynamics and the Stochastic Recursive Gradient Langevin Dynamics. We prove their convergence to the objective distribution in terms of KL-divergence under the sole assumptions of smoothness and a log-Sobolev inequality, which are weaker conditions than those used in prior works for these algorithms. With the batch size and the inner loop length set to $\sqrt{n}$, the gradient complexity to achieve an $\epsilon$-precision is $\tilde{O}((n+dn^{1/2}\epsilon^{-1})\gamma^2 L^2\alpha^{-2})$, which improves on all previous analyses. We also show some essential applications of our result to non-convex optimization.
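
As a rough illustration of the variance-reduced update this abstract refers to, here is a minimal NumPy sketch of an SVRG-style Langevin sampler with the batch size and the inner loop length both set to $\sqrt{n}$. It is not the authors' implementation. The function name `svrg_ld`, the callback `grad_i`, the convention that the potential is $F(x)=\sum_i f_i(x)$ with target density proportional to $\exp(-F)$, and mini-batch sampling without replacement are all assumptions made for this example; the paper's own scaling and inverse temperature $\gamma$ may differ.

```python
import numpy as np

def svrg_ld(grad_i, x0, n, step, n_epochs, rng=None):
    """Sketch of SVRG-style Langevin dynamics (assumed conventions, see lead-in).

    grad_i(x, idx) must return the per-example gradients of f_i at x
    for the indices in idx, as an array of shape (len(idx), d).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    m = max(1, int(np.sqrt(n)))  # batch size and inner loop length, both ~ sqrt(n)
    all_idx = np.arange(n)
    samples = []
    for _ in range(n_epochs):
        # Outer loop: store a snapshot and its full gradient.
        snapshot = x.copy()
        full_grad = grad_i(snapshot, all_idx).sum(axis=0)
        for _ in range(m):
            # Inner loop: correct the snapshot gradient using a mini-batch.
            idx = rng.choice(n, size=m, replace=False)
            correction = (grad_i(x, idx) - grad_i(snapshot, idx)).sum(axis=0)
            g = full_grad + (n / m) * correction
            # Langevin step: gradient descent plus Gaussian noise.
            x = x - step * g + np.sqrt(2.0 * step) * rng.standard_normal(d)
            samples.append(x.copy())
    return np.array(samples)
```

For instance, with the hypothetical choice $f_i(x) = \frac{1}{2n}\lVert x - y_i\rVert^2$ the target is a unit-variance Gaussian centred at the data mean, which gives a simple sanity check for the sampler.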


Aggregated Gradient Langevin Dynamics

Zhang, Chao, Xie, Jiahao, Shen, Zebang, Zhao, Peilin, Zhou, Tengfei, Qian, Hui

arXiv.org Machine Learning

In this paper, we explore a general Aggregated Gradient Langevin Dynamics (AGLD) framework for Markov Chain Monte Carlo (MCMC) sampling. We investigate the nonasymptotic convergence of AGLD with a unified analysis for different data accessing strategies (e.g. random access, cyclic access and random reshuffle) and snapshot updating strategies, under convex and nonconvex settings respectively. This is the first time that bounds for I/O-friendly strategies such as cyclic access and random reshuffle have been established in the MCMC literature. The theoretical results also indicate that methods in AGLD combine low per-iteration computational complexity with a short mixing time. Empirical studies demonstrate that our framework allows novel schemes to be derived that generate high-quality samples for large-scale Bayesian posterior learning tasks.
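
The three data accessing strategies named in this abstract can be sketched as index generators. This is an illustration based on the usual meaning of the terms (uniform sampling with replacement, a fixed sequential sweep, and a fresh permutation per pass), not code from the paper. The function name `index_stream` and the strategy labels are hypothetical.

```python
import numpy as np

def index_stream(n, n_steps, strategy, rng=None):
    """Return n_steps data indices in [0, n) under the given access strategy."""
    rng = np.random.default_rng() if rng is None else rng
    if strategy == "random_access":
        # Each step draws an index uniformly at random, with replacement.
        return rng.integers(0, n, size=n_steps)
    if strategy == "cyclic":
        # Sweep through the data in a fixed order, wrapping around.
        return np.arange(n_steps) % n
    if strategy == "random_reshuffle":
        # Draw a fresh permutation at the start of every pass over the data.
        n_passes = max(1, -(-n_steps // n))  # ceiling division
        passes = [rng.permutation(n) for _ in range(n_passes)]
        return np.concatenate(passes)[:n_steps]
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Cyclic access and random reshuffle read each data point exactly once per pass, which is what makes them friendly to sequential I/O, whereas random access may revisit or skip points within a pass.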