Goto

Collaborating Authors

 dynamical system


Generalization at the Edge of Stability

Tuci, Mario, Korkmaz, Caner, Şimşekli, Umut, Birdal, Tolga

arXiv.org Machine Learning

Training modern neural networks often relies on large learning rates, operating at the edge of stability, where the optimization dynamics exhibit oscillatory and chaotic behavior. Empirically, this regime often yields improved generalization performance, yet the underlying mechanism remains poorly understood. In this work, we represent stochastic optimizers as random dynamical systems, which often converge to a fractal attractor set (rather than a point) with a smaller intrinsic dimension. Building on this connection and inspired by Lyapunov dimension theory, we introduce a novel notion of dimension, coined the `sharpness dimension', and prove a generalization bound based on this dimension. Our results show that generalization in the chaotic regime depends on the complete Hessian spectrum and the structure of its partial determinants, highlighting a complexity that cannot be captured by the trace or spectral norm considered in prior work. Experiments across various MLPs and transformers validate our theory while also providing new insights into the recently observed phenomenon of grokking.


On the Unique Recovery of Transport Maps and Vector Fields from Finite Measure-Valued Data

Botvinick-Greenhouse, Jonah, Yang, Yunan

arXiv.org Machine Learning

We establish guarantees for the unique recovery of vector fields and transport maps from finite measure-valued data, yielding new insights into generative models, data-driven dynamical systems, and PDE inverse problems. In particular, we provide general conditions under which a diffeomorphism can be uniquely identified from its pushforward action on finitely many densities, i.e., when the data $\{(ρ_j,f_\#ρ_j)\}_{j=1}^m$ uniquely determines $f$. As a corollary, we introduce a new metric which compares diffeomorphisms by measuring the discrepancy between finitely many pushforward densities in the space of probability measures. We also prove analogous results in an infinitesimal setting, where derivatives of the densities along a smooth vector field are observed, i.e., when $\{(ρ_j,\text{div} (ρ_j v))\}_{j=1}^m$ uniquely determines $v$. Our analysis makes use of the Whitney and Takens embedding theorems, which provide estimates on the required number of densities $m$, depending only on the intrinsic dimension of the problem. We additionally interpret our results through the lens of Perron--Frobenius and Koopman operators and demonstrate how our techniques lead to new guarantees for the well-posedness of certain PDE inverse problems related to continuity, advection, Fokker--Planck, and advection-diffusion-reaction equations. Finally, we present illustrative numerical experiments demonstrating the unique identification of transport maps from finitely many pushforward densities, and of vector fields from finitely many weighted divergence observations.


Optimal Centered Active Excitation in Linear System Identification

Ito, Kaito, Proutiere, Alexandre

arXiv.org Machine Learning

We propose an active learning algorithm for linear system identification with optimal centered noise excitation. Notably, our algorithm, based on ordinary least squares and semidefinite programming, attains the minimal sample complexity while allowing for efficient computation of an estimate of a system matrix. More specifically, we first establish lower bounds of the sample complexity for any active learning algorithm to attain the prescribed accuracy and confidence levels. Next, we derive a sample complexity upper bound of the proposed algorithm, which matches the lower bound for any algorithm up to universal factors. Our tight bounds are easy to interpret and explicitly show their dependence on the system parameters such as the state dimension.


Operator Learning for Smoothing and Forecasting

Calvello, Edoardo, Carlson, Elizabeth, Kovachki, Nikola, Manta, Michael N., Stuart, Andrew M.

arXiv.org Machine Learning

Machine learning has opened new frontiers in purely data-driven algorithms for data assimilation in, and for forecasting of, dynamical systems; the resulting methods are showing some promise. However, in contrast to model-driven algorithms, analysis of these data-driven methods is poorly developed. In this paper we address this issue, developing a theory to underpin data-driven methods to solve smoothing problems arising in data assimilation and forecasting problems. The theoretical framework relies on two key components: (i) establishing the existence of the mapping to be learned; (ii) the properties of the operator learning architecture used to approximate this mapping. By studying these two components in conjunction, we establish novel universal approximation theorems for purely data driven algorithms for both smoothing and forecasting of dynamical systems. We work in the continuous time setting, hence deploying neural operator architectures. The theoretical results are illustrated with experiments studying the Lorenz `63, Lorenz `96 and Kuramoto-Sivashinsky dynamical systems.


Closed-form conditional diffusion models for data assimilation

Binder, Brianna, Dasgupta, Agnimitra, Oberai, Assad

arXiv.org Machine Learning

We propose closed-form conditional diffusion models for data assimilation. Diffusion models use data to learn the score function (defined as the gradient of the log-probability density of a data distribution), allowing them to generate new samples from the data distribution by reversing a noise injection process. While it is common to train neural networks to approximate the score function, we leverage the analytical tractability of the score function to assimilate the states of a system with measurements. To enable the efficient evaluation of the score function, we use kernel density estimation to model the joint distribution of the states and their corresponding measurements. The proposed approach also inherits the capability of conditional diffusion models of operating in black-box settings, i.e., the proposed data assimilation approach can accommodate systems and measurement processes without their explicit knowledge. The ability to accommodate black-box systems combined with the superior capabilities of diffusion models in approximating complex, non-Gaussian probability distributions means that the proposed approach offers advantages over many widely used filtering methods. We evaluate the proposed method on nonlinear data assimilation problems based on the Lorenz-63 and Lorenz-96 systems of moderate dimensionality and nonlinear measurement models. Results show the proposed approach outperforms the widely used ensemble Kalman and particle filters when small to moderate ensemble sizes are used.


Topological Detection of Hopf Bifurcations via Persistent Homology: A Functional Criterion from Time Series

Barrios, Jhonathan, Echávez, Yásser, Álvarez, Carlos F.

arXiv.org Machine Learning

We propose a topological framework for the detection of Hopf bifurcations directly from time series, based on persistent homology applied to phase space reconstructions via Takens embedding within the framework of Topological Data Analysis. The central idea is that changes in the dynamical regime are reflected in the emergence or disappearance of a dominant one-dimensional homological features in the reconstructed attractor. To quantify this behavior, we introduce a simple and interpretable scalar topological functional defined as the maximum persistence of homology classes in dimension one. This functional is used to construct a computable criterion for identifying critical parameters in families of dynamical systems without requiring knowledge of the underlying equations. The proposed approach is validated on representative systems of increasing complexity, showing consistent detection of the bifurcation point. The results support the interpretation of dynamical transitions as topological phase transitions and demonstrate the potential of topological data analysis as a model-free tool for the quantitative analysis of nonlinear time series.


A Distribution-to-Distribution Neural Probabilistic Forecasting Framework for Dynamical Systems

Yang, Tianlin, Du, Hailiang, Aslett, Louis

arXiv.org Machine Learning

Probabilistic forecasting provides a principled framework for uncertainty quantification in dynamical systems by representing predictions as probability distributions rather than deterministic trajectories. However, existing forecasting approaches, whether physics-based or neural-network-based, remain fundamentally trajectory-oriented: predictive distributions are usually accessed through ensembles or sampling, rather than evolved directly as dynamical objects. A distribution-to-distribution (D2D) neural probabilistic forecasting framework is developed to operate directly on predictive distributions. The framework introduces a distributional encoding and decoding structure around a replaceable neural forecasting module, using kernel mean embeddings to represent input distributions and mixture density networks to parameterise output predictive distributions. This design enables recursive propagation of predictive uncertainty within a unified end-to-end neural architecture, with model training and evaluation carried out directly in terms of probabilistic forecast skill. The framework is demonstrated on the Lorenz63 chaotic dynamical system. Results show that the D2D model captures nontrivial distributional evolution under nonlinear dynamics, produces skillful probabilistic forecasts without explicit ensemble simulation, and remains competitive with, and in some cases outperforms, a simplified perfect model benchmark. These findings point to a new paradigm for probabilistic forecasting, in which predictive distributions are learned and evolved directly rather than reconstructed indirectly through ensemble-based uncertainty propagation.


Deep Adaptive Model-Based Design of Experiments

Strouwen, Arno, Micluţa-Câmpeanu, Sebastian

arXiv.org Machine Learning

Model-based design of experiments (MBDOE) is essential for efficient parameter estimation in nonlinear dynamical systems. However, conventional adaptive MBDOE requires costly posterior inference and design optimization between each experimental step, precluding real-time applications. We address this by combining Deep Adaptive Design (DAD), which amortizes sequential design into a neural network policy trained offline, with differentiable mechanistic models. For dynamical systems with known governing equations but uncertain parameters, we extend sequential contrastive training objectives to handle nuisance parameters and propose a transformer-based policy architecture that respects the temporal structure of dynamical systems. We demonstrate the approach on four systems of increasing complexity: a fed-batch bioreactor with Monod kinetics, a Haldane bioreactor with uncertain substrate inhibition, a two-compartment pharmacokinetic model with nuisance clearance parameters, and a DC motor for real-time deployment.


CoNBONet: Conformalized Neuroscience-inspired Bayesian Operator Network for Reliability Analysis

Garg, Shailesh, Chakraborty, Souvik

arXiv.org Machine Learning

Time-dependent reliability analysis of nonlinear dynamical systems under stochastic excitations is a critical yet computationally demanding task. Conventional approaches, such as Monte Carlo simulation, necessitate repeated evaluations of computationally expensive numerical solvers, leading to significant computational bottlenecks. To address this challenge, we propose \textit{CoNBONet}, a neuroscience-inspired surrogate model that enables fast, energy-efficient, and uncertainty-aware reliability analysis, providing a scalable alternative to techniques such as Monte Carlo simulations. CoNBONet, short for \textbf{Co}nformalized \textbf{N}euroscience-inspired \textbf{B}ayesian \textbf{O}perator \textbf{Net}work, leverages the expressive power of deep operator networks while integrating neuroscience-inspired neuron models to achieve fast, low-power inference. Unlike traditional surrogates such as Gaussian processes, polynomial chaos expansions, or support vector regression, that may face scalability challenges for high-dimensional, time-dependent reliability problems, CoNBONet offers \textit{fast and energy-efficient inference} enabled by a neuroscience-inspired network architecture, \textit{calibrated uncertainty quantification with theoretical guarantees} via split conformal prediction, and \textit{strong generalization capability} through an operator-learning paradigm that maps input functions to system response trajectories. Validation of the proposed CoNBONet for various nonlinear dynamical systems demonstrates that CoNBONet preserves predictive fidelity, and achieves reliable coverage of failure probabilities, making it a powerful tool for robust and scalable reliability analysis in engineering design.


A mathematical framework for time-delay reservoir computing analysis

Clabaut, Anh-Tuan, Auriol, Jean, Boussaada, Islam, Mazanti, Guilherme

arXiv.org Machine Learning

Reservoir computing is a well-established approach for processing data with a much lower complexity compared to traditional neural networks. Despite two decades of experimental progress, the core properties of reservoir computing (namely separation, robustness, and fading memory) still lack rigorous mathematical foundations. This paper addresses this gap by providing a control-theoretic framework for the analysis of time-delay-based reservoir computers. We introduce formal definitions of the separation property and fading memory in terms of functional norms, and establish their connection to well-known stability notions for time-delay systems as incremental input-to-state stability. For a class of linear reservoirs, we derive an explicit lower bound for the separation distance via Fourier analysis, offering a computable criterion for reservoir design. Numerical results on the NARMA10 benchmark and continuous-time system prediction validate the approach with a minimal digital implementation.