Structure-Preserving Transformers for Learning Parametrized Hamiltonian Systems

Brantner, Benedikt; de Romemont, Guillaume; Kraus, Michael; Li, Zeyuan

arXiv.org Artificial Intelligence 

This work addresses a problem in scientific machine learning [2] that is motivated by two trends and one observation. The first trend is the use of neural networks to identify the dynamics of models for which data are available but the underlying differential equation is either (i) not known or (ii) too expensive to solve. Problem (i) often occurs when dealing with experimental data (see [8, 14]); problem (ii) is crucial in reduced-order modeling, as elaborated on below. The second trend is the gradual replacement of established neural network architectures by transformer neural networks. The architectures being replaced are primarily recurrent neural networks such as long short-term memory networks (LSTMs, see [19]), which treat time series data, but also convolutional neural networks (CNNs) for image recognition (see [10]). The observation is that including information about the physical system in a machine learning model is important. In this paper, the physical property we consider is symplecticity (see [1, 3, 16, 27]).