
Collaborating Authors

 Yang, Yunan


Invariant Measures in Time-Delay Coordinates for Unique Dynamical System Identification

arXiv.org Artificial Intelligence

Invariant measures are widely used to compare chaotic dynamical systems, as they offer robustness to noisy data, uncertain initial conditions, and irregular sampling. However, large classes of systems with distinct transient dynamics can still exhibit the same asymptotic statistical behavior, which poses challenges when invariant measures alone are used to perform system identification. Motivated by Takens' seminal embedding theory, we propose studying invariant measures in time-delay coordinates, which exhibit enhanced sensitivity to the underlying dynamics. Our first result demonstrates that a single invariant measure in time-delay coordinates can be used to perform system identification up to a topological conjugacy. This result already surpasses the capabilities of invariant measures in the original state coordinates. Continuing to explore the power of delay coordinates, we eliminate all ambiguity from the conjugacy relation by showing that unique system identification can be achieved using additional invariant measures in time-delay coordinates constructed from different observables. Our findings improve the effectiveness of invariant measures in system identification and broaden the scope of measure-theoretic approaches to modeling dynamical systems.
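As a rough illustration of the delay-coordinate construction, the sketch below builds a two-dimensional time-delay embedding of a scalar trajectory and approximates the associated invariant measure with a histogram. The logistic-map observable, the delay lag, and the histogram estimator are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def logistic_trajectory(x0=0.2, r=3.9, n=50_000, burn_in=1_000):
    """Generate a long trajectory of the logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(burn_in):          # discard transient dynamics
        x = r * x * (1.0 - x)
    traj = np.empty(n)
    for i in range(n):
        x = r * x * (1.0 - x)
        traj[i] = x
    return traj

def delay_embedding(series, dim=2, lag=1):
    """Stack time-delay coordinates (x_t, x_{t+lag}, ..., x_{t+(dim-1)*lag})."""
    n = len(series) - (dim - 1) * lag
    return np.column_stack([series[i * lag : i * lag + n] for i in range(dim)])

# Histogram approximation of the invariant measure in delay coordinates.
traj = logistic_trajectory()
delayed = delay_embedding(traj, dim=2, lag=1)
hist, edges = np.histogramdd(delayed, bins=64, range=[(0, 1), (0, 1)], density=True)
print(hist.shape)  # (64, 64) empirical density over delay-coordinate space
```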


Sampling with Adaptive Variance for Multimodal Distributions

arXiv.org Machine Learning

We propose and analyze a class of adaptive sampling algorithms for multimodal distributions on a bounded domain, which share a structural resemblance to the classic overdamped Langevin dynamics. We first demonstrate that this class of linear dynamics with adaptive diffusion coefficients and vector fields can be interpreted and analyzed as weighted Wasserstein gradient flows of the Kullback--Leibler (KL) divergence between the current distribution and the target Gibbs distribution, which directly leads to the exponential convergence of both the KL and $\chi^2$ divergences, with rates depending on the weighted Wasserstein metric and the Gibbs potential. We then show that a derivative-free version of the dynamics can be used for sampling without gradient information of the Gibbs potential and that for Gibbs distributions with nonconvex potentials, this approach could achieve significantly faster convergence than the classical overdamped Langevin dynamics. A comparison of the mean transition times between local minima of a nonconvex potential further highlights the better efficiency of the derivative-free dynamics in sampling.
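A minimal sketch of the general idea, assuming a one-dimensional double-well Gibbs potential and a hand-picked state-dependent diffusion coefficient D(x) that is larger near the energy barrier; the Itô correction term D'(x) is included so the iteration still targets the Gibbs density exp(-V) in the interior. This is only a caricature of the weighted Wasserstein gradient-flow dynamics analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_potential(x):
    """Gradient of the double-well potential V(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x**2 - 1.0)

def D(x):
    """Hypothetical state-dependent diffusion: larger near the barrier at x = 0."""
    return 1.0 + 2.0 * np.exp(-4.0 * x**2)

def D_prime(x):
    return -16.0 * x * np.exp(-4.0 * x**2)

def adaptive_langevin(n_particles=2_000, n_steps=5_000, dt=1e-3, lo=-2.0, hi=2.0):
    x = rng.uniform(lo, hi, size=n_particles)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_particles)
        # Ito drift correction D'(x) keeps exp(-V) stationary despite the varying diffusion.
        x = x + (-D(x) * grad_potential(x) + D_prime(x)) * dt + np.sqrt(2.0 * D(x) * dt) * noise
        x = np.clip(x, lo, hi)        # crude reflection onto the bounded domain
    return x

samples = adaptive_langevin()
print(samples.mean(), samples.std())
```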


Stochastic Inverse Problem: stability, regularization and Wasserstein gradient flow

arXiv.org Machine Learning

Inverse problems in physical or biological sciences often involve recovering an unknown parameter that is random. The sought-after quantity is a probability distribution of the unknown parameter that produces data consistent with the measurements. Consequently, these problems are naturally framed as stochastic inverse problems. In this paper, we explore three aspects of this problem: direct inversion, variational formulation with regularization, and optimization via gradient flows, drawing parallels with deterministic inverse problems. A key difference from the deterministic case is the space in which we operate. Here, we work within probability space rather than Euclidean or Sobolev spaces, making tools from measure transport theory necessary for the study. Our findings reveal that the choice of metric -- both in the design of the loss function and in the optimization process -- significantly impacts the stability and properties of the optimizer.
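A minimal particle-based sketch of a stochastic inverse problem, assuming a hypothetical monotone scalar forward map and a rank-matching (1D optimal transport) pairing between predicted and measured samples; the misfit, step size, and forward model are illustrative choices, not the formulations studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward(theta):
    """Hypothetical monotone forward map G(theta)."""
    return theta + 0.2 * theta**3

def forward_grad(theta):
    return 1.0 + 0.6 * theta**2

# Synthetic "measurements": the pushforward of a hidden parameter distribution.
theta_true = rng.normal(loc=0.5, scale=0.3, size=2_000)
data = forward(theta_true)

# Particle ensemble representing the unknown (random) parameter.
theta = rng.uniform(-1.0, 2.0, size=2_000)
step = 0.05

for _ in range(1_000):
    pred = forward(theta)
    # Rank matching = exact 1D optimal transport coupling between the two empirical measures.
    order_pred, order_data = np.argsort(pred), np.argsort(data)
    residual = np.empty_like(pred)
    residual[order_pred] = pred[order_pred] - data[order_data]
    # Per-particle gradient of the squared transport misfit, via the chain rule.
    theta = theta - step * residual * forward_grad(theta)

print(theta.mean(), theta.std())  # should roughly recover the statistics of theta_true
```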


Generative modeling of time-dependent densities via optimal transport and projection pursuit

arXiv.org Machine Learning

Time-varying stochastic processes arise in many applications ranging from geoscience, bioscience, and engineering to computer vision. In particular, deep learning algorithms, such as neural network parameterized normalizing flows, neural ordinary differential equations, diffusion models, and generative adversarial networks, have shown remarkable advances in learning and enabling rapid sampling from these stochastic processes. Such advances are further pronounced for very high-dimensional systems, where classical methods lose their effectiveness. However, the effective use of deep learning is frequently hampered by difficulties associated with computational cost as well as optimal hyperparameter selection. In this article, we propose a novel approach based on projection-pursuit optimal transport, which learns to sample from the densities of time-varying stochastic processes. It is competitive (both in terms of computational cost and accuracy) with a state-of-the-art deep learning algorithm (given by the neural spline flow). Crucially, our proposed method requires few hyperparameter choices by the user, in contrast with most neural network-based methodologies. Thus, our main contributions in this work are as follows: 1. We implement a projection-pursuit optimal transport-based method to learn maps between time-varying densities from snapshots of particles sampled from these densities.
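A minimal sketch in the spirit of projection pursuit combined with one-dimensional optimal transport: particle ensembles are repeatedly projected onto a direction and moved by the closed-form 1D transport map along that direction. The random choice of directions and the fixed number of sweeps are simplifying assumptions; the paper's method selects projections and builds maps between time-varying densities more carefully.

```python
import numpy as np

rng = np.random.default_rng(2)

def one_d_ot_update(source, target, direction):
    """Move source particles along `direction` by the 1D optimal transport map."""
    s_proj = source @ direction
    t_proj = target @ direction
    # Rank matching is the exact 1D OT coupling between equal-size empirical measures.
    s_order = np.argsort(s_proj)
    t_sorted = np.sort(t_proj)
    shift = np.empty_like(s_proj)
    shift[s_order] = t_sorted - s_proj[s_order]
    return source + np.outer(shift, direction)

def projection_pursuit_transport(source, target, sweeps=200):
    x = source.copy()
    d = source.shape[1]
    for _ in range(sweeps):
        direction = rng.standard_normal(d)
        direction /= np.linalg.norm(direction)
        x = one_d_ot_update(x, target, direction)
    return x

# Toy example: map a 2D Gaussian blob onto a noisy half-circle target.
source = rng.standard_normal((1_000, 2))
angles = rng.uniform(0.0, np.pi, size=1_000)
target = np.column_stack([np.cos(angles), np.sin(angles)]) + 0.1 * rng.standard_normal((1_000, 2))
mapped = projection_pursuit_transport(source, target)
print(mapped.mean(axis=0), target.mean(axis=0))  # empirical means should be close
```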


An Algebraically Converging Stochastic Gradient Descent Algorithm for Global Optimization

arXiv.org Artificial Intelligence

We propose a new gradient descent algorithm with added stochastic terms for finding the global optimizers of nonconvex optimization problems. A key component in the algorithm is the adaptive tuning of the randomness based on the value of the objective function. In the language of simulated annealing, the temperature is state-dependent. With this, we prove the global convergence of the algorithm with an algebraic rate both in probability and in the parameter space. This is a significant improvement over the classical rate obtained with a more straightforward control of the noise term. The convergence proof is based on the actual discrete setup of the algorithm, not just its continuous limit as is often done in the literature. We also present several numerical examples to demonstrate the efficiency and robustness of the algorithm for reasonably complex objective functions.
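A minimal sketch of gradient descent with an added stochastic term whose magnitude depends on the current objective value, i.e., a state-dependent temperature. The specific coupling sigma = noise_scale * f(x), the test objective, and the step sizes are illustrative assumptions rather than the schedule for which the algebraic rate is proved.

```python
import numpy as np

rng = np.random.default_rng(3)

def f(x):
    """Nonconvex objective with many local minima and global minimum value 0 at the origin."""
    return np.sum(x**2) + 2.0 * np.sum(1.0 - np.cos(2.0 * np.pi * x))

def grad_f(x):
    return 2.0 * x + 4.0 * np.pi * np.sin(2.0 * np.pi * x)

def noisy_gd(x0, n_iter=20_000, step=1e-3, noise_scale=0.05):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        # State-dependent temperature: the noise shrinks as the objective value decreases.
        sigma = noise_scale * f(x)
        x = x - step * grad_f(x) + np.sqrt(step) * sigma * rng.standard_normal(x.shape)
    return x

x_final = noisy_gd(x0=[3.3, -2.7])
print(x_final, f(x_final))
```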


Neural Inverse Operators for Solving PDE Inverse Problems

arXiv.org Artificial Intelligence

A large class of inverse problems for PDEs is only well-defined as mappings from operators to functions. Existing operator learning frameworks map functions to functions and need to be modified to learn inverse maps from data. We propose a novel architecture termed Neural Inverse Operators (NIOs) to solve these PDE inverse problems. Motivated by the underlying mathematical structure, NIO is based on a suitable composition of DeepONets and FNOs to approximate mappings from operators to functions. A variety of experiments are presented to demonstrate that NIOs significantly outperform baselines, solve PDE inverse problems robustly and accurately, and are several orders of magnitude faster than existing direct and PDE-constrained optimization methods.
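A toy sketch of the kind of composition the abstract describes, assuming a 1D grid, a vector of sensor measurements as the encoding of the input operator, and small layer sizes: a DeepONet-style branch/trunk block produces a latent field that is then processed by an FNO-style spectral convolution. This is not the NIO architecture itself, only an illustration of composing the two building blocks.

```python
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """FNO-style layer: multiply low Fourier modes by learned complex weights."""
    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / channels
        self.weights = nn.Parameter(scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

    def forward(self, x):                      # x: (batch, channels, grid)
        x_ft = torch.fft.rfft(x, dim=-1)
        out_ft = torch.zeros_like(x_ft)
        out_ft[..., :self.modes] = torch.einsum("bim,iom->bom", x_ft[..., :self.modes], self.weights)
        return torch.fft.irfft(out_ft, n=x.size(-1), dim=-1)

class ToyNIO(nn.Module):
    """DeepONet-style branch/trunk block composed with a spectral layer (illustrative only)."""
    def __init__(self, n_sensors=64, grid_size=64, latent=32, modes=12):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, 128), nn.GELU(), nn.Linear(128, latent))
        self.trunk = nn.Sequential(nn.Linear(1, 128), nn.GELU(), nn.Linear(128, latent))
        self.spectral = SpectralConv1d(channels=1, modes=modes)
        self.register_buffer("grid", torch.linspace(0.0, 1.0, grid_size).unsqueeze(-1))

    def forward(self, sensor_values):          # sensor_values: (batch, n_sensors)
        b = self.branch(sensor_values)         # (batch, latent)
        t = self.trunk(self.grid)              # (grid, latent)
        field = torch.einsum("bl,gl->bg", b, t).unsqueeze(1)   # (batch, 1, grid)
        return self.spectral(field).squeeze(1)                 # (batch, grid)

model = ToyNIO()
dummy_measurements = torch.randn(8, 64)        # e.g. sampled measurement data encoding the operator
print(model(dummy_measurements).shape)         # torch.Size([8, 64])
```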


Adaptive State-Dependent Diffusion for Derivative-Free Optimization

arXiv.org Artificial Intelligence

This paper develops and analyzes a stochastic derivative-free optimization strategy. A key feature is the state-dependent adaptive variance. We prove global convergence in probability with an algebraic rate and present quantitative results in numerical examples. A striking fact is that convergence is achieved without explicit information about the gradient and even without comparing different objective function values, as required in established methods such as the simplex method and simulated annealing. The method can instead be viewed as annealing with a state-dependent temperature.
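A minimal sketch of the idea as stated in the abstract: a pure random-walk iteration whose variance is controlled by the current objective value alone, using no gradients and no comparisons between trial points. The variance schedule sigma = gain * f(x) and the test objective are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x):
    """Nonconvex test objective with global minimum value 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x**2) + np.sum(1.0 - np.cos(3.0 * x)))

def state_dependent_diffusion(x0, n_iter=50_000, dt=1e-2, gain=0.3):
    """Pure diffusion whose variance shrinks with f(x): no gradients, no comparisons."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        sigma = gain * f(x)                 # vanishes only near the global minimizer
        x = x + np.sqrt(dt) * sigma * rng.standard_normal(x.shape)
    return x

x_final = state_dependent_diffusion([2.5, -1.5])
print(x_final, f(x_final))
```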


Efficient Natural Gradient Descent Methods for Large-Scale PDE-Based Optimization Problems

arXiv.org Artificial Intelligence

We propose efficient numerical schemes for implementing the natural gradient descent (NGD) for a broad range of metric spaces with applications to PDE-based optimization problems. Our technique represents the natural gradient direction as a solution to a standard least-squares problem. Hence, instead of calculating, storing, or inverting the information matrix directly, we apply efficient methods from numerical linear algebra. We treat both scenarios where the Jacobian, i.e., the derivative of the state variable with respect to the parameter, is either explicitly known or implicitly given through constraints. We can thus reliably compute several NGDs for large-scale parameter spaces. In particular, we are able to compute the Wasserstein NGD in thousands of dimensions, which was believed to be out of reach. Finally, our numerical results shed light on the qualitative differences between the standard gradient descent and various NGD methods based on different metric spaces in nonconvex optimization problems.
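A minimal sketch of the least-squares viewpoint in the simplest setting, assuming the metric is Euclidean in the state variable and the Jacobian J = du/dtheta is available explicitly: the natural gradient direction (J^T J)^{-1} J^T grad_u L is obtained from a single least-squares solve, so the information matrix is never formed or inverted. The toy linear model is an assumption, and other metrics (e.g., Wasserstein) require an additional weighting not shown here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy setting: the state u(theta) = A @ theta stands in for a PDE solve,
# and the loss is L(theta) = 0.5 * ||u(theta) - u_obs||^2.
n_param, n_state = 50, 200
A = rng.standard_normal((n_state, n_param))
theta_true = rng.standard_normal(n_param)
u_obs = A @ theta_true

def jacobian(theta):
    """du/dtheta; constant here, but in general obtained from the PDE constraint."""
    return A

def natural_gradient_step(theta, step=1.0):
    J = jacobian(theta)
    grad_u = A @ theta - u_obs                  # gradient of the loss w.r.t. the state
    # For the L2 metric in state space, the natural gradient direction is
    # (J^T J)^{-1} J^T grad_u, computed as a least-squares solve -- no matrix is inverted.
    direction, *_ = np.linalg.lstsq(J, grad_u, rcond=None)
    return theta - step * direction

theta = np.zeros(n_param)
for _ in range(3):
    theta = natural_gradient_step(theta)
print(np.linalg.norm(theta - theta_true))       # essentially zero after the first step here
```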