rk method
Learning by solving differential equations
Dherin, Benoit, Munn, Michael, Mazzawi, Hanna, Wunder, Michael, Medapati, Sourabh, Gonzalvo, Javier
Modern deep learning algorithms use variations of gradient descent as their main learning methods. Gradient descent can be understood as the simplest Ordinary Differential Equation (ODE) solver; namely, the Euler method applied to the gradient flow differential equation. Since Euler, many ODE solvers have been devised that follow the gradient flow equation more precisely and more stably. Runge-Kutta (RK) methods provide a family of very powerful explicit and implicit high-order ODE solvers. However, these higher-order solvers have not found wide application in deep learning so far. In this work, we evaluate the performance of higher-order RK solvers when applied in deep learning, study their limitations, and propose ways to overcome these drawbacks. In particular, we explore how to improve their performance by naturally incorporating key ingredients of modern neural network optimizers such as preconditioning, adaptive learning rates, and momentum.
Probabilistic ODE Solvers with Runge-Kutta Means
Michael Schober, David K. Duvenaud, Philipp Hennig
Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Probabilistic ODE Solvers with Runge-Kutta Means
Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Designing Stable Neural Networks using Convex Analysis and ODEs
Sherry, Ferdia, Celledoni, Elena, Ehrhardt, Matthias J., Murari, Davide, Owren, Brynjulf, Schönlieb, Carola-Bibiane
Motivated by classical work on the numerical integration of ordinary differential equations we present a ResNet-styled neural network architecture that encodes non-expansive (1-Lipschitz) operators, as long as the spectral norms of the weights are appropriately constrained. This is to be contrasted with the ordinary ResNet architecture which, even if the spectral norms of the weights are constrained, has a Lipschitz constant that, in the worst case, grows exponentially with the depth of the network. Further analysis of the proposed architecture shows that the spectral norms of the weights can be further constrained to ensure that the network is an averaged operator, making it a natural candidate for a learned denoiser in Plug-and-Play algorithms. Using a novel adaptive way of enforcing the spectral norm constraints, we show that, even with these constraints, it is possible to train performant networks. The proposed architecture is applied to the problem of adversarially robust image classification, to image denoising, and finally to the inverse problem of deblurring.
- North America > Canada > Ontario > Toronto (0.14)
- North America > Mexico > Gulf of Mexico (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (7 more...)
Convolutional Neural Networks combined with Runge-Kutta Methods
Zhu, Mai, Chang, Bo, Fu, Chong
A convolutional neural network can be constructed using numerical methods for solving dynamical systems, since the forward pass of the network can be regarded as a trajectory of a dynamical system. However, existing models based on numerical solvers cannot avoid the iterations of implicit methods, which makes the models inefficient at inference time. In this paper, we reinterpret the pre-activation Residual Networks (ResNets) and their variants from the dynamical systems view. We consider that the iterations of implicit Runge-Kutta methods are fused into the training of these models. Moreover, we propose a novel approach to constructing network models based on high-order Runge-Kutta methods in order to achieve higher efficiency. Our proposed models are referred to as the Runge-Kutta Convolutional Neural Networks (RKCNNs). The RKCNNs are evaluated on multiple benchmark datasets. The experimental results show that RKCNNs are vastly superior to other dynamical system network models: they achieve higher accuracy with much fewer resources. They also expand the family of network models based on numerical methods for dynamical systems.
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (5 more...)
- Research Report > Promising Solution (0.48)
- Research Report > New Finding (0.48)
Probabilistic ODE Solvers with Runge-Kutta Means
Schober, Michael, Duvenaud, David K., Hennig, Philipp
Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Probabilistic ODE Solvers with Runge-Kutta Means
Schober, Michael, Duvenaud, David, Hennig, Philipp
Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)