Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
Zhang, Ruqi, Li, Chunyuan, Zhang, Jianyi, Chen, Changyou, Wilson, Andrew Gordon
The posteriors over neural network weights are high-dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes and smaller steps characterize each mode. We prove that the proposed stepsize schedule converges faster to samples from a stationary distribution than SG-MCMC with standard decaying schedules. Moreover, we provide extensive experimental results demonstrating the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
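The cyclical stepsize idea can be illustrated with a short sketch. The abstract does not give the functional form of the schedule, so the cosine shape below is an assumption chosen for illustration, and the names `cyclical_stepsize`, `total_iters`, `num_cycles`, and `initial_stepsize` are hypothetical. Each cycle restarts at the large initial stepsize (exploration of new modes) and decays toward zero (sampling within a mode).

```python
import math


def cyclical_stepsize(k, total_iters, num_cycles, initial_stepsize):
    """Stepsize at iteration k (1-indexed) under an assumed cosine cyclical schedule."""
    # Length of one cycle, rounded up so the cycles cover all iterations.
    cycle_len = math.ceil(total_iters / num_cycles)
    # Fractional position inside the current cycle, in [0, 1).
    position = ((k - 1) % cycle_len) / cycle_len
    # Cosine decay from initial_stepsize toward 0 within each cycle.
    return 0.5 * initial_stepsize * (math.cos(math.pi * position) + 1.0)


# Illustrative values only: 100 iterations split into 4 cycles.
schedule = [cyclical_stepsize(k, 100, 4, 0.1) for k in range(1, 101)]
```

In a sampler built on this sketch, the early (large-stepsize) part of each cycle would typically be treated as exploration, with samples collected only once the stepsize has become small; how that split is made is a design choice not specified here.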
Feb-11-2019
- Country:
- Asia > Afghanistan
- Parwan Province > Charikar (0.04)
- Europe > France
- Occitanie > Haute-Garonne > Toulouse (0.04)
- North America > United States
- New York (0.04)
- Genre:
- Research Report (0.82)