Port-Hamiltonian Approach to Neural Network Training

Stefano Massaroli, Michael Poli, Federico Califano, Angela Faragasso, Jinkyoo Park, Atsushi Yamashita, Hajime Asama

arXiv.org Machine Learning 

Neural networks are discrete entities: they are subdivided into discrete layers and parametrized by weights which are iteratively optimized via difference equations. Recent work proposes networks whose layer outputs are no longer quantized but are instead solutions of an ordinary differential equation (ODE); however, these networks are still optimized via discrete methods (e.g. gradient descent). In this paper, we explore a different direction: namely, we propose a novel framework for learning in which the parameters themselves are solutions of ODEs. By viewing the optimization process as the evolution of a port-Hamiltonian system, we can ensure convergence to a minimum of the objective function. Numerical experiments have been performed to show the validity and effectiveness of the proposed methods.

Neural networks are universal function approximators [1]. Given enough capacity, which can be arbitrarily increased by adding more parameters to the model, they can approximate any Borel-measurable function mapping between finite-dimensional spaces. Each layer of a neural network applies an affine transformation to its input and generates an output which is then fed into the next layer. Backpropagation [2] is at the core of modern deep learning, and most state-of-the-art architectures for tasks such as image segmentation [3], generative tasks [4], image classification [5] and machine translation [6] rely on the effective combination of universal approximators and line-search optimization methods: most notably stochastic gradient descent (SGD), Adam [7], RMSProp [8] and, more recently, RAdam [9].
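To make the core idea concrete, the sketch below treats parameter optimization as the continuous-time evolution of a dissipative Hamiltonian system rather than a discrete update rule. It is a minimal illustration, not the paper's exact formulation: it assumes a Hamiltonian of the form H(theta, p) = L(theta) + 0.5*||p||^2 with linear damping, a toy quadratic loss, and explicit Euler integration of the resulting ODE; all names (loss, grad, vector_field, gamma) are hypothetical.

import numpy as np

# Toy objective: a convex quadratic L(theta) = 0.5 * theta^T A theta (assumed for illustration).
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])

def loss(theta):
    return 0.5 * theta @ A @ theta

def grad(theta):
    return A @ theta

# Dissipative Hamiltonian dynamics with H(theta, p) = L(theta) + 0.5 * ||p||^2:
#   theta_dot =  dH/dp              =  p
#   p_dot     = -dH/dtheta - gamma*p = -grad L(theta) - gamma * p
# The damping term -gamma*p makes H decrease along trajectories, so the state
# settles at a stationary point of L: the parameters are solutions of an ODE
# whose equilibria are minima of the objective.
def vector_field(theta, p, gamma=1.0):
    return p, -grad(theta) - gamma * p

# Integrate the ODE with explicit Euler; any off-the-shelf ODE solver could be
# substituted for this loop.
theta = np.array([2.0, -1.5])
p = np.zeros_like(theta)
dt = 0.01
for _ in range(5000):
    dtheta, dp = vector_field(theta, p)
    theta = theta + dt * dtheta
    p = p + dt * dp

print("theta* ~", theta, " L(theta*) ~", loss(theta))

Under these assumptions the trajectory converges to theta* = 0, the minimizer of the quadratic; discrete optimizers such as SGD can be read as particular discretizations of dynamics of this kind.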
