Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
Kumar, Dibyakanti; Jha, Samyak; Mukherjee, Anirbit
In this work, we establish that the Langevin Monte Carlo algorithm can learn depth-2 neural nets of any size and for any data, and we give non-asymptotic convergence rates for it. We achieve this by showing that, under the Total Variation distance and the q-Rényi divergence, the iterates of Langevin Monte Carlo converge to the Gibbs distribution of Frobenius-norm-regularized losses for any such net, when using smooth activations, in both classification and regression settings. Most critically, the amount of regularization needed for our results is independent of the size of the net. This result builds on several recent observations, including our previous papers showing that two-layer neural loss functions can always be regularized by a certain constant amount such that they satisfy the Villani conditions, and thus their Gibbs measures satisfy a Poincaré inequality.
arXiv.org Artificial Intelligence
Mar-20-2025
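For intuition only, below is a minimal runnable sketch (in Python/NumPy, not taken from the paper) of the Langevin Monte Carlo iteration the abstract refers to: a gradient step on a Frobenius-norm-regularized loss of a depth-2 net with a smooth activation, plus Gaussian noise scaled so the iterates target the corresponding Gibbs measure. The toy data, the tanh activation, and all hyperparameters (eta, beta, lam) are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 200, 5, 16              # samples, input dim, hidden width
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d))    # synthetic regression targets

a = rng.normal(size=width) / np.sqrt(width)  # fixed outer-layer weights
W = rng.normal(size=(width, d))              # trainable inner-layer weights

def loss_grad(W, lam):
    """Gradient w.r.t. W of (1/2n)*sum((pred - y)^2) + (lam/2)*||W||_F^2."""
    pre = X @ W.T                 # (n, width) pre-activations
    h = np.tanh(pre)              # smooth activation
    pred = h @ a                  # (n,) network outputs
    resid = pred - y              # (n,) residuals
    # Chain rule; tanh'(z) = 1 - tanh(z)^2
    grad = ((resid[:, None] * a[None, :]) * (1.0 - h**2)).T @ X / n
    return grad + lam * W         # Frobenius-norm regularization term

eta, beta, lam, steps = 1e-2, 50.0, 0.1, 2000
for _ in range(steps):
    noise = rng.normal(size=W.shape)
    # LMC iterate: gradient step on the regularized loss plus Gaussian
    # noise, targeting the Gibbs measure exp(-beta * regularized_loss).
    W = W - eta * loss_grad(W, lam) + np.sqrt(2.0 * eta / beta) * noise
```

The noise scale sqrt(2*eta/beta) is the standard discretization of the Langevin diffusion whose stationary distribution is the Gibbs measure of the regularized loss; the paper's contribution is the non-asymptotic convergence analysis of these iterates, which this sketch does not attempt to reproduce.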