The promises and pitfalls of Stochastic Gradient Langevin Dynamics

Brosse, Nicolas, Durmus, Alain, Moulines, Eric

Feb-14-2020, 20:12:49 GMT–Neural Information Processing Systems

Stochastic Gradient Langevin Dynamics (SGLD) has emerged as a key MCMC algorithm for Bayesian learning from large-scale datasets. While SGLD with decreasing step sizes converges weakly to the posterior distribution, the algorithm is often used with a constant step size in practice and has demonstrated spectacular successes in machine learning tasks. The current practice is to set the step size inversely proportional to N, where N is the number of training samples. As N becomes large, we show that the SGLD algorithm has an invariant probability measure which significantly departs from the target posterior and behaves like Stochastic Gradient Descent (SGD). This difference is inherently due to the high variance of the stochastic gradients.
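
The effect described in the abstract can be illustrated with a minimal sketch on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the model (Gaussian observations with unit variance and a standard normal prior on the mean), the constant `eta`, the minibatch size, the sampling of minibatches with replacement, and the function name `run_sgld`. The update itself is the standard SGLD step, theta <- theta - gamma * grad_estimate + sqrt(2 * gamma) * noise, with the constant step size gamma = eta / N.

```python
import numpy as np


def run_sgld(x, eta=0.1, batch_size=10, n_iter=20000, burn_in=1000, seed=0):
    """Constant-step SGLD for the mean of unit-variance Gaussian data.

    Hypothetical toy setup: prior theta ~ N(0, 1), likelihood x_i ~ N(theta, 1).
    The step size is gamma = eta / N, i.e. inversely proportional to N.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    gamma = eta / n
    theta = 0.0
    samples = np.empty(n_iter)
    for k in range(n_iter):
        # Minibatch drawn with replacement (an assumption, chosen for simplicity).
        idx = rng.integers(0, n, size=batch_size)
        # Unbiased estimate of the gradient of the negative log posterior:
        # prior term theta, plus the rescaled minibatch likelihood term.
        grad = theta + (n / batch_size) * np.sum(theta - x[idx])
        # Langevin step: gradient move plus injected Gaussian noise.
        theta = theta - gamma * grad + np.sqrt(2.0 * gamma) * rng.standard_normal()
        samples[k] = theta
    return samples[burn_in:]


# Synthetic data; the true mean 1.0 is an arbitrary illustrative choice.
data_rng = np.random.default_rng(42)
n = 10_000
x = data_rng.normal(loc=1.0, scale=1.0, size=n)

# Exact posterior for this conjugate model: N(sum(x) / (n + 1), 1 / (n + 1)).
post_mean = x.sum() / (n + 1)
post_var = 1.0 / (n + 1)

samples = run_sgld(x)
var_ratio = samples.var() / post_var

print("posterior mean:", post_mean, "SGLD mean:", samples.mean())
print("SGLD variance / posterior variance:", var_ratio)
```

In this toy setup the chain stays centred near the posterior mean, but its spread is dominated by minibatch gradient noise rather than by the injected Langevin noise, so the printed variance ratio comes out well above 1. This matches the qualitative claim above, though it is only a one-dimensional illustration and not the paper's analysis.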

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Mathematical & Statistical Methods (1.00)
  - Machine Learning > Statistical Learning
    - Gradient Descent (1.00)