When are dynamical systems learned from time series data statistically accurate?
Park, Jeongjin, Yang, Nicole, Chandramoorthy, Nisha
Conventional notions of generalization often fail to describe the ability of learned models to capture meaningful information from dynamical data. A neural network that learns complex dynamics with a small test error may still fail to reproduce the system's \emph{physical} behavior, including associated statistical moments and Lyapunov exponents. To address this gap, we propose an ergodic-theoretic approach to the generalization of complex dynamical models learned from time series data. Our main contribution is to define and analyze generalization for a broad suite of neural representations of classes of ergodic systems, including chaotic systems, in a way that captures their ability to emulate the underlying invariant (physical) measures. Our results provide theoretical justification for why regression methods for the generators of dynamical systems (Neural ODEs) fail to generalize, and why their statistical accuracy improves when Jacobian information is added during training. We verify our results on a number of ergodic chaotic systems and neural network parameterizations, including MLPs, ResNets, Fourier neural layers, and RNNs.
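As a minimal sketch of the Jacobian-matching idea referenced in the abstract: the loss below augments a standard regression loss on the learned vector field with a penalty on the mismatch of its Jacobians. The toy MLP, the weight lam, and the exact loss form are illustrative assumptions, not the paper's training objective.

```python
import torch
import torch.nn as nn

# Toy learned vector field f_theta: R^d -> R^d (illustrative MLP, not the paper's setup)
d = 3
f_theta = nn.Sequential(nn.Linear(d, 64), nn.Tanh(), nn.Linear(64, d))

def jacobian_matched_loss(x, f_ref, J_ref, lam=1.0):
    """Regression loss on the vector field plus a Jacobian-matching penalty.

    x     : (N, d) states sampled along trajectories
    f_ref : (N, d) reference vector-field values at x
    J_ref : (N, d, d) reference Jacobians df/dx at x
    lam   : weight on the Jacobian term (assumed hyperparameter)
    """
    pred = f_theta(x)
    # Per-sample Jacobian of the learned vector field via reverse-mode autodiff
    J_pred = torch.vmap(torch.func.jacrev(f_theta))(x)
    return ((pred - f_ref) ** 2).mean() + lam * ((J_pred - J_ref) ** 2).mean()
```

The reference Jacobians in this sketch could come from a known vector field or from finite differences along observed trajectories; for discrete-time data, the same penalty would apply to the Jacobian of the learned one-step map.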
Score Operator Newton transport
Chandramoorthy, Nisha, Schaefer, Florian, Marzouk, Youssef
Generating samples from a complex (e.g., non-Gaussian, high-dimensional) probability distribution is a core computational challenge in diverse applications, ranging from computational statistics and machine learning to molecular simulation. A recurring setting is one where the density ρ of the target distribution is specified only up to a normalizing constant, as in Bayesian modeling, where ρ represents the posterior density. Here, evaluations of the score ∇ log ρ are often available as well, even for complex statistical models [Villa et al., 2021]. Alternatively, many new methods enable effective score estimation from data, without explicit density estimation; examples include score estimation from time series observations in chaotic dynamical systems [Chandramoorthy and Wang, 2022, Ni, 2020] and score-based modeling of image distributions [Song et al., 2020a,b]. In these settings, transport- or "flow"-driven algorithms for generating samples have seen extensive success. The central idea is to construct a transport map from a simple, prescribed source distribution to the target distribution of interest. One class of transport approaches, represented by variational inference with normalizing flows, constructs a parametric class of invertible maps and minimizes some statistical divergence between the pushforward (see Section 2) of the source by a member of this class and the target. A different, essentially nonparametric, class of transport approaches is based on particle systems, e.g., Stein variational gradient descent (SVGD).
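For concreteness, here is a minimal sketch of the standard SVGD update mentioned above, which uses only score evaluations ∇ log ρ. The RBF kernel, bandwidth h, and step size eps are illustrative choices, and this is the generic particle update from the SVGD literature, not the score-operator Newton method proposed in the paper.

```python
import numpy as np

def svgd_step(x, score, h=1.0, eps=0.1):
    """One SVGD update for particles x of shape (N, d).

    score(x) returns the target's score ∇ log ρ at each particle, shape (N, d).
    Uses the RBF kernel k(a, b) = exp(-||a - b||^2 / (2 h)); h and eps are
    illustrative hyperparameters.
    """
    diff = x[:, None, :] - x[None, :, :]             # diff[i, j] = x_i - x_j, shape (N, N, d)
    K = np.exp(-(diff ** 2).sum(-1) / (2 * h))       # kernel matrix, shape (N, N)
    attract = K @ score(x)                           # kernel-weighted scores, (N, d)
    repulse = (diff * K[..., None]).sum(axis=1) / h  # sum over j of grad_{x_j} k(x_j, x_i)
    return x + eps * (attract + repulse) / x.shape[0]

# Example: drive overdispersed particles toward a standard Gaussian target,
# whose score is ∇ log ρ(z) = -z.
x = np.random.default_rng(0).normal(size=(200, 2)) * 3.0
for _ in range(500):
    x = svgd_step(x, lambda z: -z)
```

The attractive term moves each particle toward high-density regions as indicated by the scores of its neighbors, while the kernel-gradient term repels particles from one another, preventing collapse to a single mode.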