Mean-Field Neural ODEs via Relaxed Optimal Control

Jabir, Jean-François; Šiška, David; Szpruch, Łukasz

arXiv.org Machine Learning 

We develop a framework for the analysis of deep neural networks and neural ODE models that are trained with stochastic gradient algorithms. We do so by identifying the connections between high-dimensional data-driven control problems, deep learning and the theory of statistical sampling. In particular, we derive and study a mean-field (overdamped) Langevin algorithm for solving relaxed data-driven control problems. A key step in the analysis is to derive Pontryagin's optimality principle for data-driven relaxed control problems. Subsequently, we study uniform-in-time propagation of chaos for the time-discretised mean-field (overdamped) Langevin dynamics. We derive explicit convergence rates in terms of the learning rate, the number of particles (model parameters) and the number of iterations of the gradient algorithm. In addition, we study the error arising from the use of a finite training data set and thus provide quantitative bounds on the generalisation error. Crucially, the obtained rates are dimension-independent. This is possible by exploiting the regularity of the model with respect to the measure over the parameter space (the relaxed control).
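As a rough illustration of what a time-discretised mean-field (overdamped) Langevin algorithm can look like in practice, the following minimal sketch runs noisy gradient descent on a system of particles. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two-layer tanh model, the toy regression data, the Gaussian potential U(θ) = |θ|²/2 used as the regulariser, and the hypothetical function names `predict`, `loss`, `langevin_step` and `train`. Each particle plays the role of one model parameter, the step size `lr` is the learning rate, and `sigma` scales both the injected noise and the regularisation.

```python
import math
import random


def predict(particles, x):
    """Mean-field network output: average of a * tanh(w * x) over all particles."""
    return sum(a * math.tanh(w * x) for a, w in particles) / len(particles)


def loss(particles, data):
    """Half the mean squared error over the (finite) training set."""
    return 0.5 * sum((predict(particles, x) - y) ** 2 for x, y in data) / len(data)


def langevin_step(particles, data, lr, sigma, rng):
    """One Euler-Maruyama step of the particle system (noisy gradient descent)."""
    residuals = [(x, predict(particles, x) - y) for x, y in data]
    noise_scale = sigma * math.sqrt(lr)
    updated = []
    for a, w in particles:
        # Gradient of the loss with respect to this particle, rescaled by the
        # number of particles (the usual mean-field scaling).
        grad_a = grad_w = 0.0
        for x, r in residuals:
            t = math.tanh(w * x)
            grad_a += r * t
            grad_w += r * a * (1.0 - t * t) * x
        grad_a /= len(data)
        grad_w /= len(data)
        # Drift = loss gradient + (sigma^2 / 2) * gradient of the assumed
        # Gaussian potential U(theta) = |theta|^2 / 2, followed by Gaussian noise.
        a_new = a - lr * (grad_a + 0.5 * sigma ** 2 * a) + noise_scale * rng.gauss(0.0, 1.0)
        w_new = w - lr * (grad_w + 0.5 * sigma ** 2 * w) + noise_scale * rng.gauss(0.0, 1.0)
        updated.append((a_new, w_new))
    return updated


def train(n_particles=50, n_steps=200, lr=0.01, sigma=0.1, seed=0):
    """Run the discretised dynamics on a toy 1-D regression task (assumed data)."""
    rng = random.Random(seed)
    data = [(x / 10.0, math.sin(2.0 * x / 10.0)) for x in range(-10, 11)]
    particles = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_particles)]
    initial = loss(particles, data)
    for _ in range(n_steps):
        particles = langevin_step(particles, data, lr, sigma, rng)
    return initial, loss(particles, data)
```

Calling `train()` returns the training loss before and after the iterations. The three quantities that the abstract's convergence rates are stated in, namely the learning rate, the number of particles and the number of iterations, appear here as `lr`, `n_particles` and `n_steps`. Setting `sigma=0.0` reduces the scheme to plain particle gradient descent.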
