Stochastic Gradient Langevin Dynamics with Variance Reduction

Zhishen Huang, Stephen Becker

arXiv.org Artificial Intelligence 

Stochastic gradient Langevin dynamics (SGLD) has gained the attention of optimization researchers due to its global optimization properties. This paper proves an improved convergence property to local minimizers of nonconvex objective functions for SGLD accelerated by variance reduction. Moreover, we prove an ergodicity property of the SGLD scheme, which gives insight into its potential to find global minimizers of nonconvex objectives. In this paper we consider stochastic gradient descent (SGD) with variance reduction (VR) and Gaussian noise injected at every iteration step. For historical reasons, this particular form of injected Gaussian noise bears the name Langevin dynamics (LD), so the scheme we consider is referred to as stochastic gradient Langevin dynamics with variance reduction (SGLD-VR). We prove an ergodicity property of SGLD-VR schemes when used as optimization algorithms, a property that plain SGD without the additional noise does not have. Because ergodicity implies that the LD process visits every region of the space with non-trivial probability, the set of global minima is also traversed during the iteration. We also provide convergence results of SGLD-VR to local minima in a style similar to that of [Xu et al., 2018].
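To make the scheme concrete, the following is a minimal sketch of one possible SGLD-VR loop, assuming an SVRG-style variance-reduced gradient (a snapshot gradient plus a per-sample correction) and an inverse temperature beta scaling the injected Gaussian noise. The function names, step size, and epoch structure are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def sgld_vr(grad_fi, full_grad, x0, n, step=1e-3, beta=1e3,
            n_epochs=20, inner_steps=100, rng=None):
    """Hypothetical SVRG-style SGLD-VR sketch.

    grad_fi(x, i): gradient of the i-th component function at x
    full_grad(x):  full-batch gradient (1/n) * sum_i grad_fi(x, i)
    beta:          inverse temperature for the injected Gaussian noise
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_epochs):
        snapshot = x.copy()
        mu = full_grad(snapshot)            # anchor gradient at the snapshot
        for _ in range(inner_steps):
            i = rng.integers(n)
            # variance-reduced stochastic gradient (SVRG-style correction)
            g = grad_fi(x, i) - grad_fi(snapshot, i) + mu
            # Langevin step: gradient update plus scaled Gaussian noise
            noise = rng.standard_normal(x.shape)
            x = x - step * g + np.sqrt(2.0 * step / beta) * noise
    return x
```

Without the noise term, this reduces to plain variance-reduced SGD; the added Gaussian perturbation is what yields the ergodicity discussed above.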
