Convergence of mean-field Langevin dynamics: time-space discretization, stochastic gradient, and variance reduction

Oct-11-2024, 00:55:37 GMT–Neural Information Processing Systems

The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift, and it arises naturally from the optimization of two-layer neural networks via (noisy) gradient descent. Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures. However, all prior analyses assumed the infinite-particle or continuous-time limit, and could not handle stochastic gradient updates. We provide a general framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time discretization, and stochastic gradient approximation. To demonstrate the wide applicability of our framework, we establish quantitative convergence rate guarantees to the regularized global optimal solution for (i) a wide range of learning problems, such as mean-field neural networks and MMD minimization, and (ii) different gradient estimators, including SGD and SVRG.
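As a rough illustration of the three approximations named in the abstract (finite particles, discrete time, stochastic gradient), the sketch below runs noisy gradient descent on the neurons of a two-layer mean-field network. Everything specific here is an assumption made for illustration and is not taken from the paper: the tanh activation, the squared loss, the quadratic regularizer `lam1 * ||x||^2`, the function names `mfld_step` and `mean_field_loss`, and all hyperparameter values.

```python
import numpy as np

def mean_field_loss(X, Z, y):
    """Squared loss of the mean-field network f(z) = mean_i tanh(x_i . z)."""
    pred = np.tanh(Z @ X.T).mean(axis=1)
    return 0.5 * np.mean((pred - y) ** 2)

def mfld_step(X, Z, y, eta, lam, lam1, rng, batch_size=None):
    """One time-discretized, finite-particle MFLD update (illustrative sketch).

    X: (N, d) particles (neurons); Z: (n, d) inputs; y: (n,) targets.
    eta: step size; lam: entropy-regularization strength (noise level);
    lam1: strength of the assumed quadratic regularizer lam1 * ||x||^2.
    """
    if batch_size is not None:
        # Stochastic gradient: estimate the drift from a random mini-batch.
        idx = rng.choice(Z.shape[0], size=batch_size, replace=False)
        Z, y = Z[idx], y[idx]
    act = np.tanh(Z @ X.T)                 # (b, N) neuron outputs
    resid = act.mean(axis=1) - y           # (b,) residuals of the averaged network
    # Distribution-dependent drift: each particle's gradient depends on all
    # particles through the residual of the empirical measure.
    grad = (resid[:, None] * (1.0 - act ** 2)).T @ Z / Z.shape[0]   # (N, d)
    drift = grad + 2.0 * lam1 * X
    noise = rng.standard_normal(X.shape)
    return X - eta * drift + np.sqrt(2.0 * lam * eta) * noise

# Small usage example on synthetic data (all values are arbitrary).
rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 2))
y = np.tanh(Z[:, 0])
X = rng.standard_normal((50, 2))
for _ in range(100):
    X = mfld_step(X, Z, y, eta=0.05, lam=1e-3, lam1=1e-3, rng=rng, batch_size=32)
final_loss = mean_field_loss(X, Z, y)
```

Setting `batch_size=None` uses the full-batch gradient, and setting `lam=0` removes the Gaussian noise, which reduces the update to plain regularized gradient descent on the particles. A variance-reduced estimator such as SVRG would replace only the mini-batch gradient computation; it is not shown here.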

langevin dynamic, mean-field langevin dynamic, stochastic gradient, (6 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)