

Quantitative Propagation of Chaos for SGD in Wide Neural Networks

Neural Information Processing Systems

In this paper, we investigate the limiting behavior of a continuous-time counterpart of the Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural networks, as the number of neurons (i.e., the size of the hidden layer) $N \to +\infty$. Following a probabilistic approach, we show 'propagation of chaos' for the particle system defined by this continuous-time dynamics under different scenarios, indicating that the statistical interaction between the particles asymptotically vanishes. In particular, we establish quantitative convergence with respect to $N$ of any particle to a solution of a mean-field McKean-Vlasov equation in the space of probability measures endowed with the Wasserstein distance. In comparison to previous works on the subject, we consider settings in which the sequence of stepsizes in SGD can depend on both the number of neurons and the iteration count. We then identify two regimes under which different mean-field limits are obtained, one of them corresponding to an implicitly regularized version of the minimization problem at hand. We perform various experiments on real datasets to validate our theoretical results, assessing the existence of these two regimes on classification problems and illustrating our convergence results.
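For concreteness, the generic shape of such a result can be sketched as follows; this is illustrative only and not the paper's exact statement (the drift $b$, the noise level $\sigma$, and the rate $C(T)/\sqrt{N}$ are placeholders). The $N$-particle dynamics, its McKean-Vlasov limit, and the quantitative Wasserstein-2 bound take the form:

```latex
% Illustrative only: generic N-particle SDE, its McKean-Vlasov limit,
% and a quantitative propagation-of-chaos bound in Wasserstein-2 distance.
\[
  dW_t^{i,N} = b\bigl(W_t^{i,N}, \mu_t^N\bigr)\,dt + \sigma\,dB_t^i,
  \qquad
  \mu_t^N = \frac{1}{N}\sum_{j=1}^N \delta_{W_t^{j,N}},
\]
\[
  d\overline{W}_t = b\bigl(\overline{W}_t, \mu_t\bigr)\,dt + \sigma\,dB_t,
  \qquad
  \mu_t = \operatorname{Law}\bigl(\overline{W}_t\bigr),
\]
\[
  \sup_{0 \le t \le T} \mathcal{W}_2\Bigl(\operatorname{Law}\bigl(W_t^{i,N}\bigr),\, \mu_t\Bigr)
  \;\le\; \frac{C(T)}{\sqrt{N}}.
\]
```

Propagation of chaos then says that any fixed particle of the interacting system becomes, in the limit, an independent copy of the mean-field process $\overline{W}$.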


Quantitative Propagation of Chaos for SGD in Wide Neural Networks (Supplementary Material)

Neural Information Processing Systems

Contents (supplementary material): S2.1 Presentation of the modified SGLD and its continuous counterpart; Mean field approximation and propagation of chaos for mSGLD; S3 Technical results; S4 Quantitative propagation of chaos; S4.1 Existence of strong solutions to the particle SDE. Excerpts: If $F = \mathbb{R}$, then we simply write $C(E)$. The proof is postponed to Section S4.4. Consider now the mean-field SDE starting from a random variable $W$. Then, there exists $L \geq 0$ such that the following hold. In what follows, we bound separately the two terms on the right-hand side.
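To make the particle/mean-field comparison concrete, here is a minimal Python sketch of an interacting N-particle SDE simulated with Euler-Maruyama. This is an assumption-laden toy, not the paper's mSGLD: the drift function `drift`, the noise level `sigma`, and the one-dimensional state are all illustrative choices.

```python
# Toy Euler-Maruyama simulation of an N-particle mean-field SDE:
#   dW^i = b(W^i, mu^N) dt + sigma dB^i,  mu^N = empirical measure.
# Illustrative only; the drift below is NOT the paper's dynamics.
import numpy as np

rng = np.random.default_rng(0)

def drift(w, mean_w):
    # Toy mean-field drift: confinement toward 0 plus attraction to the
    # empirical mean (the interaction term).
    return -w - (w - mean_w)

def simulate(N, T=1.0, dt=1e-3, sigma=0.5):
    """Euler-Maruyama simulation of the N-particle system."""
    steps = int(T / dt)
    w = rng.standard_normal(N)  # i.i.d. initial particles (chaotic start)
    for _ in range(steps):
        noise = sigma * np.sqrt(dt) * rng.standard_normal(N)
        w = w + drift(w, w.mean()) * dt + noise
    return w

# As N grows, the empirical law should approach the mean-field
# (McKean-Vlasov) law; here we just track two summary statistics.
for N in (10, 100, 1000, 10000):
    w = simulate(N)
    print(f"N={N:6d}  empirical mean={w.mean():+.4f}  std={w.std():.4f}")
```

As N increases, the printed statistics should fluctuate less and less around their mean-field values, which is the qualitative content of propagation of chaos.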


Review for NeurIPS paper: Quantitative Propagation of Chaos for SGD in Wide Neural Networks

Neural Information Processing Systems

Additional Feedback: I have a few minor comments. Specifically: (1a) Depending on how one thinks about it, the learning rate in previous papers on infinite-width SGD also depends on the number of hidden units. What I mean is that you explicitly put the 1/N factor into the size of the weights of the last layer. This has the effect of introducing a 1/N factor into the derivative dLoss/dW, where W is a weight in the first layer, which is akin to putting an extra 1/N into the learning rate. In previous papers (NTK-type analyses in deeper networks), this weight scale is sometimes N^{-1/2} instead.
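To spell out the reviewer's point, consider the mean-field parameterization of a two-layer network that they describe (the symbols below are ours, introduced for illustration, not the paper's notation). The 1/N output scaling shows up directly in the first-layer gradient:

```latex
\[
  f_N(x) = \frac{1}{N}\sum_{i=1}^{N} a_i\,\sigma\bigl(\langle w_i, x\rangle\bigr)
  \quad\Longrightarrow\quad
  \frac{\partial f_N}{\partial w_i}(x)
  = \frac{1}{N}\,a_i\,\sigma'\bigl(\langle w_i, x\rangle\bigr)\,x .
\]
```

An SGD step on $w_i$ therefore carries an extra $1/N$ factor relative to an unscaled network, exactly as if the learning rate had been divided by $N$; NTK-type analyses use an $N^{-1/2}$ output scaling instead, which leads to a different limit.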


Review for NeurIPS paper: Quantitative Propagation of Chaos for SGD in Wide Neural Networks

Neural Information Processing Systems

In a nutshell, the authors bound the distance between the SGD iterates and the mean-field dynamics. Using this theory, the authors also study the effect of the scaling of the step size. All reviewers thought the paper was interesting, in particular the regime-change result. Reviewer 2 had some concerns about the strictness of the assumptions, which were mitigated by the authors' response. I concur with the positive reviews and recommend that the paper be accepted.

