ODE-Net
Learning continuous models for continuous physics
Krishnapriyan, Aditi S., Queiruga, Alejandro F., Erichson, N. Benjamin, Mahoney, Michael W.
Dynamical systems that evolve continuously over time are ubiquitous throughout science and engineering. Machine learning (ML) provides data-driven approaches to model and predict the dynamics of such systems. A core issue with this approach is that ML models are typically trained on discrete data, using ML methodologies that are not aware of underlying continuity properties. This results in models that often do not capture any underlying continuous dynamics -- either of the system of interest, or indeed of any related system. To address this challenge, we develop a convergence test based on numerical analysis theory. Our test verifies whether a model has learned a function that accurately approximates an underlying continuous dynamics. Models that fail this test fail to capture relevant dynamics, rendering them of limited utility for many scientific prediction tasks; while models that pass this test enable both better interpolation and better extrapolation in multiple ways. Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
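The convergence test this abstract describes can be illustrated with a minimal sketch: integrate a candidate vector field at successively halved step sizes and check that the resulting trajectories approach a limit. Everything below (the RK4 integrator, the step-halving schedule, the toy linear vector field, and all function names) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, x0, t_end, h):
    """Integrate dx/dt = f(x) from x0 to time t_end with fixed step h."""
    x = np.asarray(x0, dtype=float)
    for _ in range(int(round(t_end / h))):
        x = rk4_step(f, x, h)
    return x

def convergence_diffs(f, x0, t_end, h0=0.1, levels=4):
    """Halve the step size repeatedly and return norms of successive
    solution differences; for a model that has learned a genuinely
    continuous dynamics these should shrink toward zero."""
    sols = [integrate(f, x0, t_end, h0 / 2**k) for k in range(levels)]
    return [float(np.linalg.norm(sols[k + 1] - sols[k]))
            for k in range(levels - 1)]

# Toy "learned" vector field: exact linear decay dx/dt = -x.
diffs = convergence_diffs(lambda x: -x, x0=[1.0], t_end=1.0)
```

A model that fails the test would instead show differences that stall or grow as the step size shrinks.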
Variational formulations of ODE-Net as a mean-field optimal control problem and existence results
Isobe, Noboru, Okumura, Mizuho
This paper presents a mathematical analysis of ODE-Net, a continuum model of deep neural networks (DNNs). In recent years, machine learning researchers have proposed replacing the deep structure of DNNs with ODEs as a continuum limit. These studies regard the "learning" of ODE-Net as the minimization of a "loss" constrained by a parametric ODE. Although the existence of a minimizer for this minimization problem must be assumed, few studies have investigated it analytically in detail. In the present paper, the existence of a minimizer is discussed based on a formulation of ODE-Net as a measure-theoretic mean-field optimal control problem. The existence result is proved when the neural network describing the vector field of ODE-Net is linear with respect to its learnable parameters. The proof employs the measure-theoretic formulation combined with the direct method of the Calculus of Variations. Second, an idealized minimization problem is proposed to remove the linearity assumption above. This problem is inspired by a kinetic regularization associated with the Benamou--Brenier formula and by universal approximation theorems for neural networks. The proofs of these existence results use variational methods, differential equations, and mean-field optimal control theory. They offer a new analytic approach to investigating the learning process of deep neural networks.
Implementation and (Inverse Modified) Error Analysis for implicitly-templated ODE-nets
Zhu, Aiqing, Bertalan, Tom, Zhu, Beibei, Tang, Yifa, Kevrekidis, Ioannis G.
We focus on learning unknown dynamics from data using ODE-nets templated on implicit numerical initial value problem solvers. First, we perform Inverse Modified error analysis of the ODE-nets using unrolled implicit schemes for ease of interpretation. It is shown that training an ODE-net using an unrolled implicit scheme returns a close approximation of an Inverse Modified Differential Equation (IMDE). In addition, we establish a theoretical basis for hyper-parameter selection when training such ODE-nets, whereas current strategies usually treat numerical integration of ODE-nets as a black box. We thus formulate an adaptive algorithm which monitors the level of error and adapts the number of (unrolled) implicit solution iterations during the training process, so that the error of the unrolled approximation is less than the current learning loss. This helps accelerate training, while maintaining accuracy. Several numerical experiments are performed to demonstrate the advantages of the proposed algorithm compared to nonadaptive unrollings, and validate the theoretical analysis. We also note that this approach naturally allows for incorporating partially known physical terms in the equations, giving rise to what is termed "gray box" identification.
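As a rough illustration of the unrolling idea (a sketch, not the authors' algorithm): an implicit Euler step can be approximated by unrolling a fixed number of fixed-point iterations, and the iteration count can be grown until the change contributed by one more unroll drops below a tolerance playing the role of the current learning loss. The function names, the scalar test problem, and the specific stopping rule are all assumptions.

```python
import numpy as np

def implicit_euler_unrolled(f, x, h, n_iters):
    """Approximate the implicit Euler step y = x + h * f(y) by unrolling
    n_iters fixed-point iterations, starting from the explicit value."""
    y = x
    for _ in range(n_iters):
        y = x + h * f(y)
    return y

def adaptive_unroll(f, x, h, loss_level, max_iters=50):
    """Increase the number of unrolled iterations until the change per
    extra unroll falls below loss_level (a stand-in for the current
    training loss), mirroring the adaptive criterion described above."""
    y_prev = implicit_euler_unrolled(f, x, h, 1)
    for n in range(2, max_iters + 1):
        y = implicit_euler_unrolled(f, x, h, n)
        if np.linalg.norm(y - y_prev) < loss_level:
            return y, n
        y_prev = y
    return y_prev, max_iters

# Scalar test problem dx/dt = -x: the exact implicit Euler step is x / (1 + h).
y, n_used = adaptive_unroll(lambda v: -v, np.array([1.0]), h=0.1,
                            loss_level=3e-6)
```

For this contractive test problem each extra unroll shrinks the update by a factor of about h, so only a handful of iterations are needed.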
ODEs learn to walk: ODE-Net based data-driven modeling for crowd dynamics
Predicting the behaviors of pedestrian crowds is of critical importance for a variety of real-world problems. Data-driven modeling, which aims to learn mathematical models from observed data, is a promising tool for constructing models that can make accurate predictions of such systems. In this work, we present a data-driven modeling approach based on the ODE-Net framework for constructing continuous-time models of crowd dynamics. We discuss some challenging issues in applying the ODE-Net method to such problems, which are primarily associated with the dimensionality of the underlying crowd system, and we propose to address these issues by incorporating the social-force concept in the ODE-Net framework. Finally, application examples are provided to demonstrate the performance of the proposed method.
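The social-force concept mentioned here can be sketched as an ODE right-hand side: each pedestrian relaxes toward a desired velocity and is repelled exponentially by neighbours. In an ODE-Net setting, parameters such as the repulsion strength (or the whole repulsion term) would be learned from data; the Helbing-style form, the constants, and the function name below are illustrative assumptions, not the paper's model.

```python
import numpy as np

def social_force_rhs(pos, vel, goals, tau=0.5, A=2.0, B=0.3):
    """Toy social-force right-hand side for n pedestrians in 2D.
    pos, vel, goals: arrays of shape (n, 2).
    Returns (d pos / dt, d vel / dt)."""
    # Relaxation toward unit-speed motion in the direction of each goal.
    desired = goals - pos
    desired /= np.linalg.norm(desired, axis=1, keepdims=True) + 1e-9
    acc = (desired - vel) / tau
    # Pairwise exponential repulsion from all other pedestrians.
    for i in range(len(pos)):
        d = pos[i] - pos                      # vectors from others to i
        r = np.linalg.norm(d, axis=1)
        mask = r > 1e-9                       # skip self-interaction
        acc[i] += np.sum(A * np.exp(-r[mask, None] / B)
                         * d[mask] / r[mask, None], axis=0)
    return vel, acc

# Two pedestrians walking toward each other along the x-axis.
pos = np.array([[0.0, 0.0], [1.0, 0.0]])
vel = np.zeros((2, 2))
goals = np.array([[5.0, 0.0], [-5.0, 0.0]])
dpos, dvel = social_force_rhs(pos, vel, goals)
```

Building the structure of such a right-hand side into the ODE-Net is one way to tame the dimensionality issues the abstract mentions.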
Convolutional Neural Networks combined with Runge-Kutta Methods
Zhu, Mai, Chang, Bo, Fu, Chong
A convolutional neural network can be constructed using numerical methods for solving dynamical systems, since the forward pass of the network can be regarded as a trajectory of a dynamical system. However, existing models based on numerical solvers cannot avoid the iterations of implicit methods, which makes the models inefficient at inference time. In this paper, we reinterpret the pre-activation Residual Networks (ResNets) and their variants from the dynamical systems view. We consider that the iterations of implicit Runge-Kutta methods are fused into the training of these models. Moreover, we propose a novel approach to constructing network models based on high-order Runge-Kutta methods in order to achieve higher efficiency. Our proposed models are referred to as the Runge-Kutta Convolutional Neural Networks (RKCNNs). The RKCNNs are evaluated on multiple benchmark datasets. The experimental results show that RKCNNs are vastly superior to other dynamical system network models: they achieve higher accuracy with far fewer resources. They also expand the family of network models based on numerical methods for dynamical systems.
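The core construction can be sketched in a few lines: a pre-activation residual block is one forward Euler step x + F(x), and a higher-order block reuses the same branch F several times, combined with Runge-Kutta weights. The toy branch below (a fixed random linear map plus tanh) stands in for a learned convolutional module; all names and the RK2 choice are illustrative assumptions, not the RKCNN architecture itself.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))

def F(x):
    """Toy residual branch; a stand-in for a conv + activation module."""
    return np.tanh(W @ x)

def euler_block(x, h=1.0):
    """Plain pre-activation ResNet block = one forward Euler step."""
    return x + h * F(x)

def rk2_block(x, h=1.0):
    """Heun / second-order Runge-Kutta block: the same branch F is
    evaluated twice and combined with RK weights, giving higher-order
    accuracy per block without extra parameters."""
    k1 = F(x)
    k2 = F(x + h * k1)
    return x + (h / 2.0) * (k1 + k2)

x = np.ones(4)
y_euler, y_rk2 = euler_block(x), rk2_block(x)
```

Higher-order explicit schemes extend this pattern with more stages sharing one set of weights per block.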
Continuous-in-Depth Neural Networks
Queiruga, Alejandro F., Erichson, N. Benjamin, Taylor, Dane, Mahoney, Michael W.
Recent work has attempted to interpret residual networks (ResNets) as one step of a forward Euler discretization of an ordinary differential equation, focusing mainly on syntactic algebraic similarities between the two systems. Discrete dynamical integrators of continuous dynamical systems, however, have a much richer structure. We first show that ResNets fail to be meaningful dynamical integrators in this richer sense. We then demonstrate that neural network models can learn to represent continuous dynamical systems, with this richer structure and properties, by embedding them into higher-order numerical integration schemes, such as the Runge-Kutta schemes. Based on these insights, we introduce ContinuousNet as a continuous-in-depth generalization of ResNet architectures. ContinuousNets exhibit an invariance to the particular computational graph manifestation. That is, the continuous-in-depth model can be evaluated with different discrete time step sizes, which changes the number of layers, and different numerical integration schemes, which changes the graph connectivity. We show that this can be used to develop an incremental-in-depth training scheme that improves model quality, while significantly decreasing training time. We also show that, once trained, the number of units in the computational graph can even be decreased, for faster inference with little-to-no accuracy drop.
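The invariance described here can be illustrated with a toy depth-continuous model: the same weights are evaluated with different numbers of "layers" simply by changing the step count of the integrator over depth-time [0, 1]. The random linear branch, the forward Euler scheme, and all names below are illustrative assumptions standing in for a trained ContinuousNet.

```python
import numpy as np

rng = np.random.default_rng(1)
W = 0.1 * rng.standard_normal((3, 3))

def field(x, t):
    """Depth-continuous residual branch f(x, t); a toy stand-in for the
    learned module, with a mild explicit dependence on depth-time t."""
    return np.tanh(W @ x) * (1.0 + 0.1 * t)

def continuous_net(x, n_steps, t_end=1.0):
    """Evaluate the same weights with a different number of 'layers' by
    changing the Euler step count over depth-time [0, t_end]."""
    h = t_end / n_steps
    for i in range(n_steps):
        x = x + h * field(x, i * h)
    return x

x0 = np.ones(3)
y8, y16, y32 = (continuous_net(x0, n) for n in (8, 16, 32))
```

If the model truly represents a continuous system, doubling the number of layers changes the output less and less, which is the discretization invariance the abstract describes.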