
Collaborating Authors

 Dasgupta, Agnimitra


Unifying and extending Diffusion Models through PDEs for solving Inverse Problems

arXiv.org Machine Learning

Diffusion models have emerged as powerful generative tools with applications in computer vision and scientific machine learning (SciML), where they have been used to solve large-scale probabilistic inverse problems. Traditionally, these models have been derived using principles of variational inference, denoising, statistical signal processing, and stochastic differential equations. In contrast to the conventional presentation, in this study we derive diffusion models using ideas from linear partial differential equations and demonstrate that this approach has several benefits, including a constructive derivation of the forward and reverse processes, a unified derivation of multiple formulations and sampling strategies, and the discovery of a new class of models. We also apply the conditional version of these models to solving canonical conditional density estimation problems and challenging inverse problems. These problems help establish benchmarks for systematically quantifying the performance of different formulations and sampling strategies, both in this study and in future work. Finally, we identify and implement a mechanism through which a single diffusion model can be applied to measurements obtained from multiple measurement operators. Taken together, the contents of this manuscript provide a new understanding and several new directions in the application of diffusion models to solving physics-based inverse problems.
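The PDE connection the abstract alludes to can be illustrated with a minimal numpy sketch: for the simplest forward (noising) process, the density of the noised data solves the heat equation, a linear PDE, and its Gaussian-smoothed score is what the reverse process consumes. The 1-D Gaussian "data" distribution and its parameters below are invented for illustration; this is the standard heat-kernel picture, not the paper's own derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# If x_t = x_0 + sqrt(t) * w with w ~ N(0, 1), the density p(x, t) of x_t
# solves the heat equation dp/dt = 0.5 * d2p/dx2: the data density
# convolved with a Gaussian kernel of variance t.
x0 = rng.normal(loc=2.0, scale=0.5, size=200_000)  # toy 1-D "data"
t = 1.5
xt = x0 + np.sqrt(t) * rng.normal(size=x0.size)

# Heat-kernel signature: variance grows linearly in t.
print(np.var(x0))   # ~0.25
print(np.var(xt))   # ~0.25 + 1.5 = 1.75

# For this Gaussian data density the score of p(x, t) is closed-form;
# it is the quantity a reverse process needs to map noise back to data.
def score(x, t):
    return -(x - 2.0) / (0.5**2 + t)
```

The same variance bookkeeping underlies the variance-exploding formulation of diffusion models; other formulations rescale the state as well as adding noise.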


Memorization and Regularization in Generative Diffusion Models

arXiv.org Artificial Intelligence

Diffusion models have emerged as a powerful framework for generative modeling. At the heart of the methodology is score matching: learning gradients of families of log-densities for noisy versions of the data distribution at different scales. When the loss function adopted in score matching is evaluated using empirical data, rather than the population loss, the minimizer corresponds to the score of a time-dependent Gaussian mixture. However, use of this analytically tractable minimizer leads to data memorization: in both unconditioned and conditioned settings, the generative model returns the training samples. This paper contains an analysis of the dynamical mechanism underlying memorization. The analysis highlights the need for regularization to avoid reproducing the analytically tractable minimizer; and, in so doing, lays the foundations for a principled understanding of how to regularize. Numerical experiments investigate the properties of: (i) Tikhonov regularization; (ii) regularization designed to promote asymptotic consistency; and (iii) regularizations induced by under-parameterization of a neural network or by early stopping when training a neural network. These experiments are evaluated in the context of memorization, and directions for future development of regularization are highlighted.
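The memorization mechanism described above can be reproduced numerically: the minimizer of the empirical score-matching loss is the score of a Gaussian mixture centered on the training points, and annealed Langevin sampling with that score collapses onto the training set. The three-point "training set" and annealing schedule below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
data = np.array([-2.0, 0.5, 3.0])  # toy training set

def empirical_score(x, sigma):
    # Score of the mixture (1/n) sum_i N(x_i, sigma^2): the exact
    # minimizer of the empirical score-matching loss at noise scale sigma.
    d = data[None, :] - x[:, None]                 # (batch, n)
    logw = -0.5 * (d / sigma) ** 2
    logw -= logw.max(axis=1, keepdims=True)        # numerical stability
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return (w * d).sum(axis=1) / sigma**2

# Annealed Langevin dynamics with the empirical score: as sigma -> 0
# the iterates collapse onto the training points (memorization).
x = 3.0 * rng.normal(size=500)
for sigma in [3.0, 1.0, 0.3, 0.1, 0.03]:
    eps = 0.1 * sigma**2
    for _ in range(200):
        x = x + eps * empirical_score(x, sigma) \
              + np.sqrt(2 * eps) * rng.normal(size=x.size)

# Every generated sample ends up near some training point.
dist = np.abs(x[:, None] - data[None, :]).min(axis=1)
print(dist.max())
```

Regularizing the score (e.g. by under-parameterization or early stopping, as studied in the paper) breaks this collapse by preventing the network from representing the mixture score exactly.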


Conditional score-based diffusion models for solving inverse problems in mechanics

arXiv.org Machine Learning

We propose a framework to perform Bayesian inference using conditional score-based diffusion models to solve a class of inverse problems in mechanics involving the inference of a specimen's spatially varying material properties from noisy measurements of its mechanical response to loading. Conditional score-based diffusion models are generative models that learn to approximate the score function of a conditional distribution using samples from the joint distribution. More specifically, the score functions corresponding to multiple realizations of the measurement are approximated using a single neural network, the so-called score network, which is subsequently used to sample the posterior distribution using an appropriate Markov chain Monte Carlo scheme based on Langevin dynamics. Training the score network only requires simulating the forward model. Hence, the proposed approach can accommodate black-box forward models and complex measurement noise. Moreover, once the score network has been trained, it can be re-used to solve the inverse problem for different realizations of the measurements. We demonstrate the efficacy of the proposed approach on a suite of high-dimensional inverse problems in mechanics that involve inferring heterogeneous material properties from noisy measurements. Some examples we consider involve synthetic data, while others include data collected from actual elastography experiments. Further, our applications demonstrate that the proposed approach can handle different measurement modalities, complex patterns in the inferred quantities, non-Gaussian and non-additive noise models, and nonlinear black-box forward models. The results show that the proposed framework can solve large-scale physics-based inverse problems efficiently.
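The sampling stage of the pipeline above can be sketched in a few lines. In this toy linear-Gaussian problem the conditional score of the posterior is available analytically, so it stands in for the trained score network; the measurement operator, noise level, and prior are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear inverse problem: y = A x + noise, Gaussian prior on x.
A = np.array([[1.0, 0.5]])
sigma_y, sigma_x = 0.1, 1.0
x_true = np.array([0.3, -0.7])
y = A @ x_true + sigma_y * rng.normal()

def cond_score(x):
    # Analytic score of p(x | y) for this Gaussian model; in the paper
    # a trained conditional score network plays this role.
    return (A.T @ (y - A @ x)) / sigma_y**2 - x / sigma_x**2

# Unadjusted Langevin dynamics targeting the posterior.
eps = 1e-3
samples = []
x = np.zeros(2)
for i in range(50_000):
    x = x + eps * cond_score(x) + np.sqrt(2 * eps) * rng.normal(size=2)
    if i >= 10_000:                      # discard burn-in
        samples.append(x.copy())
samples = np.array(samples)

# Closed-form Gaussian posterior mean, for comparison.
P = np.linalg.inv(A.T @ A / sigma_y**2 + np.eye(2) / sigma_x**2)
mu = P @ A.T @ y / sigma_y**2
print(samples.mean(axis=0), mu)
```

Because the score network is trained once on joint samples, re-solving the inverse problem for a new measurement only means re-running this sampling loop with a new `y`, which is the reuse property the abstract highlights.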


Diffusion Models for Generating Ballistic Spacecraft Trajectories

arXiv.org Artificial Intelligence

Generative modeling has drawn much attention in creative and scientific data generation tasks. Score-based diffusion models, a type of generative model that iteratively learns to denoise data, have shown state-of-the-art results on tasks such as image generation, multivariate time series forecasting, and robotic trajectory planning. We apply diffusion models to the generation of ballistic spacecraft trajectories and analyze the model's ability to learn the characteristics of the original dataset and to produce transfers that follow the underlying dynamics. Ablation studies were conducted to determine how model performance varies with model size and trajectory temporal resolution. In addition, a performance benchmark is designed to assess the generative model's usefulness for trajectory design, conduct model performance comparisons, and lay the groundwork for evaluating generative models for trajectory design beyond diffusion. The results of this analysis showcase several useful properties of diffusion models that, taken together, can enable a future system for generative trajectory design powered by diffusion models.

INTRODUCTION

Diffusion models are a type of generative model that has achieved state-of-the-art performance across creative and scientific domains. In trajectory design, diffusion models have shown promising results in robotics. Janner et al. propose combining diffusion models with reinforcement learning techniques to develop flexible trajectory planning strategies.


Solution of physics-based inverse problems using conditional generative adversarial networks with full gradient penalty

arXiv.org Artificial Intelligence

The solution of probabilistic inverse problems for which the corresponding forward problem is constrained by physical principles is challenging. This is especially true if the dimension of the inferred vector is large and the prior information about it is in the form of a collection of samples. In this work, a novel deep learning based approach is developed and applied to solving these types of problems. The approach utilizes samples of the inferred vector drawn from the prior distribution and a physics-based forward model to generate training data for a conditional Wasserstein generative adversarial network (cWGAN). The cWGAN learns the probability distribution for the inferred vector conditioned on the measurement and produces samples from this distribution. The cWGAN developed in this work differs from earlier versions in that its critic is required to be 1-Lipschitz with respect to both the inferred and the measurement vectors and not just the former. This leads to a loss term with the full (and not partial) gradient penalty. It is shown that this rather simple change leads to a stronger notion of convergence for the conditional density learned by the cWGAN and a more robust and accurate sampling strategy. Through numerical examples it is shown that this change also translates to better accuracy when solving inverse problems. The numerical examples considered include illustrative problems where the true distribution and/or statistics are known, and a more complex inverse problem motivated by applications in biomechanics.
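The difference between the full and partial gradient penalties is easy to see on a toy critic. Below, a linear critic with known gradients stands in for the neural-network critic of the cWGAN (whose gradients would come from autodiff over interpolated inputs); the gradient vectors are invented for illustration.

```python
import numpy as np

# Toy critic D(x, y) = a.x + b.y with constant, known gradients.
a = np.array([0.6, 0.8])  # gradient wrt the inferred vector x
b = np.array([0.5, 0.5])  # gradient wrt the measurement vector y

def partial_gp():
    # Earlier conditional WGAN-GP variants: penalize only the
    # gradient with respect to x.
    return (np.linalg.norm(a) - 1.0) ** 2

def full_gp():
    # Full gradient penalty of this work: the critic must be
    # 1-Lipschitz jointly in (x, y), so the penalty is computed on
    # the concatenated gradient.
    g = np.concatenate([a, b])
    return (np.linalg.norm(g) - 1.0) ** 2

print(partial_gp())  # 0.0 -- ||a|| = 1, so the partial penalty sees nothing
print(full_gp())     # > 0 -- the joint gradient norm exceeds 1
```

Here the partial penalty is blind to the critic's sensitivity in the measurement direction, while the full penalty flags it; this is exactly the gap that motivates the stronger 1-Lipschitz requirement in the paper.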


Uncertainty quantification for ptychography using normalizing flows

arXiv.org Machine Learning

Ptychography, an essential tool for high-resolution and nondestructive material characterization, poses a challenging large-scale nonlinear and non-convex inverse problem. However, its intrinsic photon statistics create clear opportunities for statistics-based deep learning approaches to tackle these challenges, an avenue that remains underexplored. In this work, we explore normalizing flows to obtain a surrogate for the high-dimensional posterior, which also enables characterization of the uncertainty associated with the reconstruction: an extremely desirable capability for judging reconstruction quality in the absence of ground truth, spotting spurious artifacts, and guiding future experiments using the returned uncertainty patterns. We demonstrate the performance of the proposed method on a synthetic sample with added noise and in various physical experimental settings.
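The property that makes normalizing flows suitable as a posterior surrogate is that one bijection gives both exact sampling and an exact log-density via the change-of-variables formula. A minimal sketch with a single affine bijection on a standard-normal base (real flows stack many such layers, e.g. affine couplings; the parameters below are invented):

```python
import numpy as np

rng = np.random.default_rng(4)

# Affine flow x = s * z + m over a standard normal base z ~ N(0, 1).
m, s = 1.5, 0.5

def sample(n):
    # Push base samples forward through the bijection.
    return s * rng.normal(size=n) + m

def log_prob(x):
    # Change of variables: log p(x) = log p_base(z) - log |det J|,
    # where z is the inverse transform and J = dx/dz = s.
    z = (x - m) / s
    log_base = -0.5 * z**2 - 0.5 * np.log(2 * np.pi)
    return log_base - np.log(s)

xs = sample(200_000)
print(xs.mean(), xs.std())  # ~1.5, ~0.5: matches N(m, s^2) exactly
```

Having `log_prob` in closed form is what allows the flow to be trained as a surrogate for the high-dimensional ptychographic posterior, and the spread of its samples is what supplies the uncertainty maps.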


Gaussian Process for Tomography

arXiv.org Machine Learning

Tomographic imaging refers to the reconstruction of a 3D object from its 2D projections, obtained by sectioning the object with some form of penetrating wave from many different directions. It has had a revolutionary impact in fields ranging from biology, physics, and chemistry to astronomy [1, 2]. The technique requires accurate image reconstruction, however, and the resulting reconstruction problem is an ill-posed optimization problem because of insufficient measurements [3]. A direct consequence of ill-posedness is that the reconstruction does not have a unique solution. Quantifying solution quality is therefore challenging, given the absence of ground truth and the presence of measurement noise. Moreover, ill-posedness creates a need for regularization, which introduces new parameters into the problem. The choice of regularization parameters can lead to substantial variations in the reconstruction, and ascertaining their optimal values is difficult without access to ground truth [4]. Moving from an optimization perspective on tomographic inversion to a Bayesian statistical perspective provides a useful reframing of these issues: the ill-posedness of the optimization view is replaced by quantified uncertainty in the statistical view, while regularization appears in the guise of parameter estimation.
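The reframing in the last sentence can be made concrete in the linear-Gaussian case: the Bayesian posterior mean coincides with the Tikhonov-regularized solution, with the regularization parameter determined by the noise-to-prior variance ratio, and the posterior covariance quantifies the uncertainty left by the missing measurements. The tiny underdetermined operator below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy linear tomography: y = A x + noise, with fewer measurements
# than pixels (ill-posed), and a Gaussian prior N(0, tau^2 I) on x.
n_pix, n_meas = 8, 4
A = rng.normal(size=(n_meas, n_pix))
sigma, tau = 0.05, 1.0
x_true = rng.normal(size=n_pix)
y = A @ x_true + sigma * rng.normal(size=n_meas)

# Gaussian posterior: mean and covariance in closed form.
cov = np.linalg.inv(A.T @ A / sigma**2 + np.eye(n_pix) / tau**2)
mean = cov @ A.T @ y / sigma**2

# The posterior mean equals Tikhonov regularization with
# lam = sigma^2 / tau^2: the "regularization parameter" is now a
# statistical quantity that can itself be estimated from data.
lam = sigma**2 / tau**2
tik = np.linalg.solve(A.T @ A + lam * np.eye(n_pix), A.T @ y)
print(np.allclose(mean, tik))  # True
```

In the nonlinear, non-Gaussian settings a Gaussian process prior targets, the same identification holds in spirit: hyperparameters of the prior play the role of regularization parameters, and the posterior replaces a single point estimate.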