Mean-Field Neural ODEs via Relaxed Optimal Control
Jabir, Jean-François, Šiška, David, Szpruch, Łukasz
We develop a framework for the analysis of deep neural networks and neural ODE models that are trained with stochastic gradient algorithms. We do this by identifying the connections between high-dimensional data-driven control problems, deep learning and the theory of statistical sampling. In particular, we derive and study a mean-field (overdamped) Langevin algorithm for solving relaxed data-driven control problems. A key step in the analysis is to derive Pontryagin's optimality principle for data-driven relaxed control problems. Subsequently, we study uniform-in-time propagation of chaos for the time-discretised mean-field (overdamped) Langevin dynamics. We derive explicit convergence rates in terms of the learning rate, the number of particles/model parameters and the number of iterations of the gradient algorithm. In addition, we study the error arising when using a finite training data set and thus provide quantitative bounds on the generalisation error. Crucially, the obtained rates are dimension-independent. This is possible by exploiting the regularity of the model with respect to the measure over the parameter space (the relaxed control).
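To make the three quantities in the convergence rate concrete (learning rate, number of particles, number of iterations), the following is a minimal sketch of a time-discretised overdamped Langevin particle system. It is an illustration under stated assumptions, not the algorithm analysed in the paper: the toy one-hidden-layer model, the function names (`predict`, `loss`, `particle_grad`, `langevin_train`), the quadratic regulariser `lam` and the noise scaling `sigma * sqrt(lr)` are all hypothetical choices made for this example.

```python
import numpy as np

def predict(w, x):
    # Toy mean-field model (assumption): the output is the average of
    # tanh(w_i * x) over all particles w_i.
    return np.tanh(np.outer(x, w)).mean(axis=1)

def loss(w, x, y):
    # Empirical squared loss on the finite training set (x, y).
    return 0.5 * np.mean((predict(w, x) - y) ** 2)

def particle_grad(w, x, y, lam):
    # Gradient felt by each particle. Particles interact only through
    # the averaged prediction, i.e. through their empirical measure.
    act = np.tanh(np.outer(x, w))      # shape (n_data, n_particles)
    resid = act.mean(axis=1) - y       # shape (n_data,)
    grad = ((1.0 - act ** 2) * x[:, None] * resid[:, None]).mean(axis=0)
    return grad + lam * w              # plus a quadratic regulariser

def langevin_train(x, y, n_particles=200, lr=0.1, sigma=0.05,
                   lam=1e-3, n_steps=500, seed=0):
    # Euler-Maruyama discretisation: gradient step plus Gaussian noise.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n_particles)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_particles)
        w = w - lr * particle_grad(w, x, y, lam) + sigma * np.sqrt(lr) * noise
    return w

if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 32)
    y = 0.5 * np.tanh(2.0 * x)
    w = langevin_train(x, y)
    print("training loss:", loss(w, x, y))
```

Here `lr` plays the role of the learning rate, `n_particles` the number of particles/model parameters, and `n_steps` the number of iterations. Setting `sigma=0` reduces the update to plain gradient descent on the particle system.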
Dec-11-2019
- Country:
- North America > United States
- Massachusetts > Middlesex County > Belmont (0.04)
- Europe
- Russia (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- Cambridgeshire > Cambridge (0.04)
- Asia
- Russia (0.04)
- Middle East > Jordan (0.04)
- Genre:
- Research Report (0.63)
- Technology: