 vpnn


Locally-symplectic neural networks for learning volume-preserving dynamics

arXiv.org Artificial Intelligence

We propose locally-symplectic neural networks (LocSympNets) for learning the flow of phase volume-preserving dynamics. The construction of LocSympNets stems from the theorem of the local Hamiltonian description of the divergence-free vector field and from splitting methods based on symplectic integrators. Symplectic gradient modules of the recently proposed symplecticity-preserving neural networks (SympNets) are used to construct invertible locally-symplectic modules. To further preserve properties of the flow of a dynamical system, LocSympNets are extended to symmetric locally-symplectic neural networks (SymLocSympNets), such that the inverse of a SymLocSympNet is equal to its feed-forward propagation with the negative time step, which is a general property of the flow of a dynamical system. LocSympNets and SymLocSympNets are studied numerically by learning linear and nonlinear volume-preserving dynamics. We demonstrate learning of linear traveling wave solutions to the semi-discretized advection equation, periodic trajectories of the Euler equations of the motion of a free rigid body, and quasi-periodic solutions of the charged particle motion in an electromagnetic field. LocSympNets and SymLocSympNets can learn linear and nonlinear dynamics to a high degree of accuracy even when random noise is added to the training data. When learning a single trajectory of the rigid body dynamics, locally-symplectic neural networks can learn both quadratic invariants of the system with absolute relative errors below 1%. In addition, SymLocSympNets produce qualitatively good long-time predictions when the whole system is learned from randomly sampled data. LocSympNets and SymLocSympNets can also produce accurate short-time predictions of quasi-periodic solutions, as illustrated by the example of the charged particle motion in an electromagnetic field.
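The two structural properties named in this abstract, volume preservation and the identity "inverse equals forward propagation with negative time step", can be illustrated with a small NumPy sketch. This is not the paper's LocSympNet construction: the names `up_module`, `low_module`, and `symmetric_step`, the parameters `K`, `a`, `b`, and the choice of `tanh` are assumptions made for illustration. Each module is a shear map in the spirit of a gradient module (one block of coordinates is shifted by a function of the other block), so its Jacobian is unit triangular, and a palindromic composition makes the step symmetric in the time step `h`.

```python
import numpy as np

def up_module(x, h, K, a, b):
    """Shift the first half of x by a function of the second half (shear map)."""
    d = x.shape[0] // 2
    p, q = x[:d], x[d:]
    p_new = p + h * K.T @ (a * np.tanh(K @ q + b))
    return np.concatenate([p_new, q])

def low_module(x, h, K, a, b):
    """Shift the second half of x by a function of the first half (shear map)."""
    d = x.shape[0] // 2
    p, q = x[:d], x[d:]
    q_new = q + h * K.T @ (a * np.tanh(K @ p + b))
    return np.concatenate([p, q_new])

def symmetric_step(x, h, params_up, params_low):
    """Palindromic composition up(h/2), low(h), up(h/2).

    By construction, symmetric_step(., -h) undoes symmetric_step(., h).
    """
    x = up_module(x, h / 2, *params_up)
    x = low_module(x, h, *params_low)
    return up_module(x, h / 2, *params_up)

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x (for checking only)."""
    n = x.size
    J = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        J[:, i] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

# Hypothetical random parameters: width m, half-dimension d.
rng = np.random.default_rng(0)
d, m = 2, 4
params_up = (rng.normal(size=(m, d)), rng.normal(size=m), rng.normal(size=m))
params_low = (rng.normal(size=(m, d)), rng.normal(size=m), rng.normal(size=m))
x0 = rng.normal(size=2 * d)
h = 0.1

x1 = symmetric_step(x0, h, params_up, params_low)
x_back = symmetric_step(x1, -h, params_up, params_low)
J = numerical_jacobian(lambda z: symmetric_step(z, h, params_up, params_low), x0)

print("round-trip error:", np.max(np.abs(x_back - x0)))
print("det of Jacobian: ", np.linalg.det(J))
```

The round-trip error should sit at floating-point rounding level and the Jacobian determinant should be close to 1, up to finite-difference error. The sketch splits the state into two equal halves for simplicity; the abstract's local Hamiltonian construction for general divergence-free fields is not reproduced here.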


Volume-preserving Neural Networks: A Solution to the Vanishing Gradient Problem

arXiv.org Machine Learning

Department of Mathematics and Statistics, McGill University, Montreal, QC H3A 0E9, Canada

Abstract

We propose a novel approach to addressing the vanishing (or exploding) gradient problem in deep neural networks. We construct a new architecture for deep neural networks where all layers (except the output layer) of the network are a combination of rotation, permutation, diagonal, and activation sublayers which are all volume preserving. This control on the volume forces the gradient (on average) to maintain equilibrium and not explode or vanish. Volume-preserving neural networks train reliably, quickly and accurately and the learning rate is consistent across layers in deep volume-preserving neural networks. To demonstrate this we apply our volume-preserving neural network model to two standard datasets.

Keywords: volume-preserving, neural network, machine learning, deep learning, vanishing gradient problem

1. Introduction

Deep neural networks are characterized by the composition of a large number of functions (aka layers), each typically consisting of an affine transformation followed by a non-affine "activation function". Each layer is determined by a number of parameters which are trained on data to approximate some function. The deepness refers to the number of such functions composed (or the number of layers). The number of layers required to be deep is not well-defined, but an overview of deep learning (Schmidhuber, 2015) states that any

arXiv:1911.09576v2
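As a rough illustration of the linear part of the architecture described in the abstract above, the following NumPy sketch composes a rotation, a permutation, and a diagonal sublayer and checks that the product has determinant of absolute value 1. The function names, the pairing of coordinates in the Givens rotations, and the parametrization of the diagonal entries as `exp(t - mean(t))` are assumptions made for this example, not the authors' definitions. The activation sublayer is left out because its form is not given in this excerpt.

```python
import numpy as np

def givens_rotation(n, i, j, theta):
    """n x n rotation by angle theta in the (i, j) coordinate plane."""
    R = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    R[i, i] = c
    R[j, j] = c
    R[i, j] = -s
    R[j, i] = s
    return R

def rotation_sublayer(n, thetas):
    """Product of Givens rotations on the pairs (0,1), (2,3), ... (n even)."""
    R = np.eye(n)
    for k, theta in enumerate(thetas):
        R = givens_rotation(n, 2 * k, 2 * k + 1, theta) @ R
    return R

def permutation_sublayer(n, rng):
    """Random permutation matrix; its determinant is +1 or -1."""
    return np.eye(n)[rng.permutation(n)]

def diagonal_sublayer(t):
    """Diagonal matrix whose positive entries multiply to 1."""
    return np.diag(np.exp(t - t.mean()))

def volume_preserving_linear(n, rng):
    """Compose rotation, permutation, and diagonal sublayers (n must be even)."""
    thetas = rng.uniform(0.0, 2.0 * np.pi, size=n // 2)
    t = rng.normal(size=n)
    return rotation_sublayer(n, thetas) @ permutation_sublayer(n, rng) @ diagonal_sublayer(t)

rng = np.random.default_rng(0)
W = volume_preserving_linear(6, rng)
print("|det W| =", abs(np.linalg.det(W)))
```

Because every factor has determinant of absolute value 1, the composed map neither expands nor contracts volume, which is the property the abstract connects to gradients that do not explode or vanish on average.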