Stochastic Gradient Langevin Dynamics Algorithm


A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Neural Information Processing Systems

We propose an adaptively weighted stochastic gradient Langevin dynamics (SGLD) algorithm, termed contour stochastic gradient Langevin dynamics (CSGLD), for Bayesian learning with big data. The proposed algorithm is essentially a scalable dynamic importance sampler that automatically flattens the target distribution, greatly facilitating the simulation of multi-modal distributions. Theoretically, we prove a stability condition and establish the asymptotic convergence of the self-adapting parameter to a unique fixed point, regardless of the non-convexity of the original energy function; we also present an error analysis for the weighted averaging estimators. Empirically, the CSGLD algorithm is tested on multiple benchmark datasets, including CIFAR10 and CIFAR100. The numerical results indicate its superiority over existing state-of-the-art algorithms in training deep neural networks.
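The abstract describes the method only at a high level. As a rough illustration of the two coupled updates it refers to (a Langevin step on the flattened energy and a stochastic-approximation update of the self-adapting parameter), the following sketch runs a CSGLD-style loop on a toy one-dimensional double-well energy. The energy, the bin partition, and all tuning constants (zeta, tau, lr, omega) are illustrative assumptions, not the authors' reference implementation.

import numpy as np

rng = np.random.default_rng(0)

def U(x):       # toy double-well energy with asymmetric wells (an assumption)
    return 0.1 * (x**2 - 4.0)**2 + 0.5 * x

def grad_U(x):  # exact gradient here; a stochastic estimate in practice
    return 0.4 * x * (x**2 - 4.0) + 0.5

# Partition the energy range into m bins; theta estimates the bin probabilities.
m, u_min, du = 50, -2.0, 0.5
theta = np.full(m, 1.0 / m)

zeta, tau, lr = 0.75, 1.0, 1e-3   # flattening exponent, temperature, step size
x = 2.0
samples, weights = [], []

def bin_index(x):
    return min(max(int((U(x) - u_min) / du), 1), m - 1)

for k in range(1, 20001):
    j = bin_index(x)
    # Langevin step on the flattened energy: the multiplier is the correction
    # induced by the current estimate of the contour weights.
    mult = 1.0 + zeta * tau * (np.log(theta[j]) - np.log(theta[j - 1])) / du
    x = x - lr * mult * grad_U(x) + np.sqrt(2.0 * lr * tau) * rng.standard_normal()
    # Stochastic-approximation update of theta toward its fixed point.
    j = bin_index(x)
    omega = 10.0 / (k + 100.0)        # decaying step size (an assumption)
    onehot = np.zeros(m)
    onehot[j] = 1.0
    theta += omega * theta[j]**zeta * (onehot - theta)
    theta = np.clip(theta, 1e-12, None)
    theta /= theta.sum()
    samples.append(x)
    weights.append(theta[j]**zeta)    # importance weights for the original target

s, w = np.array(samples), np.array(weights)
print("reweighted posterior mean:", np.sum(w * s) / np.sum(w))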


Review for NeurIPS paper: A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Neural Information Processing Systems

My main concern is whether using a flattened surrogate energy in this fashion is suitable for most sampling situations. The main reason is that, by construction, the iterates do not follow the true distribution particularly closely; for example, a plot of the samples obtained in the synthetic experiments (figs 2c--d) would look quite different from the original. While this does allow the algorithm to bounce out of local optima, the deviation from the true energy makes the samples obtained after convergence less useful. For point estimation, we might be able to get away with these samples when the multiple modes of the real energy are roughly symmetric (as in the synthetic Gaussian experiments); even with a 'flattened' energy (which can be thought of as lower peaks with higher elevation between them), the original distribution's symmetry would be essentially preserved, and the mean and other point estimates would be close enough. But flattening energies with a skewed distribution of modes might not be as accurate: the flattened version might have a mean closer to the 'center' of the space, while the original would be closer to one of the modes near the periphery (I am visualizing a simple 2-d space).
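The reviewer's point about skewed modes can be checked numerically. The sketch below uses a hypothetical skewed bimodal density and a power-law flattening exponent zeta, both chosen purely for illustration; it shows the flattened density's mean being pulled toward the center of the space, and that importance reweighting by the density ratio pi/flat (the kind of correction a dynamic importance sampler applies) recovers the original mean.

import numpy as np

rng = np.random.default_rng(1)
xs = np.linspace(-10.0, 10.0, 4001)   # uniform grid, so sums approximate integrals

# Skewed bimodal target: a heavy mode near the periphery, a light mode at 0.
pi_ = 0.8 * np.exp(-0.5 * (xs - 6.0)**2) + 0.2 * np.exp(-0.5 * xs**2)
pi_ /= pi_.sum()

zeta = 0.5                # illustrative flattening exponent
flat = pi_**zeta
flat /= flat.sum()

print("true mean:      ", np.sum(xs * pi_))    # close to the heavy mode near 6
print("flattened mean: ", np.sum(xs * flat))   # pulled toward the 'center'

# Drawing from the flattened density but reweighting each sample by
# w = pi/flat recovers the true mean.
idx = rng.choice(len(xs), size=200_000, p=flat)
w = (pi_ / flat)[idx]
print("reweighted mean:", np.sum(w * xs[idx]) / np.sum(w))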


Review for NeurIPS paper: A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Neural Information Processing Systems

The paper presents valuable theoretical and empirical evidence for a novel algorithm. The AC is confident this represents valuable work but was a bit torn about the acceptance decision, as the reviewers point out several important areas where the paper needs improvement.


Extended Stochastic Gradient MCMC for Large-Scale Bayesian Variable Selection

Song, Qifan, Sun, Yan, Ye, Mao, Liang, Faming

arXiv.org Machine Learning

Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are applicable only to a small class of problems for which the parameter space has a fixed dimension and the log-posterior density is differentiable with respect to the parameters. This paper proposes an extended stochastic gradient MCMC algorithm which, by introducing appropriate latent variables, can be applied to more general large-scale Bayesian computing problems, such as those involving dimension jumping and missing data. Numerical studies show that the proposed algorithm is highly scalable and much more efficient than traditional MCMC algorithms. The proposed algorithm greatly alleviates the computational burden of Bayesian methods in big data applications.
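The abstract does not detail the construction, but the general pattern it describes (augmenting the model with latent variables and alternating their imputation with a stochastic-gradient parameter update) can be sketched on a toy missing-data problem. The Gaussian model, the imputation step, and all tuning constants below are assumptions for illustration, not the paper's algorithm.

import numpy as np

rng = np.random.default_rng(0)

# Simulate y_i ~ N(mu_true, 1) and hide 30% of the entries (missing data).
n, mu_true = 10_000, 3.0
y = mu_true + rng.standard_normal(n)
missing = rng.random(n) < 0.3
y[missing] = np.nan

mu, lr, batch = 0.0, 1e-4, 256
for k in range(5000):
    # (i) Latent-variable step: impute the missing entries from their
    # conditional distribution given the current parameter value.
    y[missing] = mu + rng.standard_normal(missing.sum())
    # (ii) SGLD step on mu: minibatch gradient of the log-posterior
    # (flat prior), rescaled by n/batch to estimate the full-data gradient.
    idx = rng.integers(0, n, size=batch)
    grad = (n / batch) * np.sum(y[idx] - mu)   # d/dmu of the log-likelihood
    mu = mu + 0.5 * lr * grad + np.sqrt(lr) * rng.standard_normal()

print("posterior draw of mu after burn-in:", mu)   # should be near 3.0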