Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels

Neural Information Processing Systems 

Optimization is a key component for training machine learning models and has a strong impact on their generalization. In this paper, we consider a particular optimization method, the stochastic gradient Langevin dynamics (SGLD) algorithm, and investigate the generalization of models trained by SGLD.
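As context for the analysis, a minimal sketch of the SGLD update is given below, assuming the standard form theta_{t+1} = theta_t - eta * grad L(theta_t) + sqrt(2 * eta / beta) * xi, where xi is standard Gaussian noise and beta is an inverse temperature. The function and parameter names here are illustrative, not taken from the paper.

```python
import numpy as np

def sgld_step(theta, grad, step_size, inverse_temp, rng):
    """One SGLD update: a gradient step plus isotropic Gaussian noise
    whose scale is set by the step size and the inverse temperature."""
    noise = rng.normal(size=theta.shape)
    return theta - step_size * grad + np.sqrt(2.0 * step_size / inverse_temp) * noise

# Toy example: run SGLD on f(theta) = 0.5 * ||theta||^2 with noisy
# (stochastic) gradients, mimicking minibatch gradient estimates.
rng = np.random.default_rng(0)
theta = np.array([5.0, -3.0])
for _ in range(2000):
    grad = theta + 0.1 * rng.normal(size=theta.shape)  # noisy gradient estimate
    theta = sgld_step(theta, grad, step_size=0.01, inverse_temp=100.0, rng=rng)
```

After enough iterations, the iterates concentrate near the minimizer, with residual fluctuations governed by the injected Gaussian noise; this injected noise is precisely what the Gaussian-channel viewpoint in the title exploits.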