Safta, Cosmin
Uncertainty quantification of neural network models of evolving processes via Langevin sampling
Safta, Cosmin, Jones, Reese E., Patel, Ravi G., Wonnacot, Raelynn, Bolintineanu, Dan S., Hamel, Craig M., Kramer, Sharlotte L. B.
We propose a scalable, approximate inference hypernetwork framework for a general model of history-dependent processes. The flexible data model is based on a neural ordinary differential equation (NODE) representing the evolution of internal states together with a trainable observation model subcomponent. The posterior distribution corresponding to the data model parameters (weights and biases) follows a stochastic differential equation with a drift term related to the score of the posterior that is learned jointly with the data model parameters. This Langevin sampling approach offers flexibility in balancing the computational budget between the evaluation cost of the data model and the approximation of the posterior density of its parameters. We demonstrate the performance of the hypernetwork on chemical reaction and material physics data and compare it to mean-field variational inference.
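As a point of reference for the sampling mechanics only (not the hypernetwork proposed in the paper), the sketch below implements plain unadjusted Langevin dynamics over a flat parameter vector, assuming a hypothetical user-supplied score function `log_post_grad`:

```python
# Minimal sketch of unadjusted Langevin dynamics for sampling model parameters.
# `log_post_grad` (the score of the posterior) is a hypothetical stand-in, not
# the learned drift term of the paper's hypernetwork.
import numpy as np

def langevin_sample(theta0, log_post_grad, step=1e-4, n_steps=10_000, rng=None):
    """theta_{k+1} = theta_k + step * grad log p(theta_k | data) + sqrt(2*step) * xi"""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.array(theta0, dtype=float)
    samples = []
    for _ in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta + step * log_post_grad(theta) + np.sqrt(2.0 * step) * noise
        samples.append(theta.copy())
    return np.array(samples)
```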
Advancing calibration for stochastic agent-based models in epidemiology with Stein variational inference and Gaussian process surrogates
Robertson, Connor, Safta, Cosmin, Collier, Nicholson, Ozik, Jonathan, Ray, Jaideep
Accurate calibration of stochastic agent-based models (ABMs) in epidemiology is crucial to make them useful in public health policy decisions and interventions. Traditional calibration methods, e.g., Markov chain Monte Carlo (MCMC), which yield a probability density function for the parameters being calibrated, are often computationally expensive. For highly parametrized ABMs, the calibration process can become computationally infeasible. This paper investigates the utility of Stein variational inference (SVI) as an alternative calibration technique for stochastic epidemiological ABMs approximated by Gaussian process (GP) surrogates. SVI leverages gradient information to iteratively update a set of particles in the space of parameters being calibrated, offering potential advantages in scalability and efficiency for high-dimensional ABMs. The ensemble of particles yields a joint probability density function for the parameters and serves as the calibration. We compare the performance of SVI and MCMC in calibrating CityCOVID, a stochastic epidemiological ABM, focusing on predictive accuracy and calibration effectiveness. Our results demonstrate that SVI maintains predictive accuracy and calibration effectiveness comparable to MCMC, making it a viable alternative for complex epidemiological models. We also discuss the practical challenges of using a gradient-based calibration method such as SVI, which include careful tuning of hyperparameters and monitoring of the particle dynamics.
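To illustrate the particle update that such gradient-based calibration relies on, the sketch below performs one step of standard Stein variational gradient descent with an RBF kernel and median-heuristic bandwidth; `grad_log_post` is a hypothetical stand-in for the posterior gradient (e.g., supplied through a differentiable GP surrogate) and is not the CityCOVID interface:

```python
# Minimal sketch of one SVGD particle update (standard formulation, not the
# paper's full calibration pipeline).
import numpy as np

def svgd_step(particles, grad_log_post, eps=1e-2):
    """Kernel-weighted attraction toward high posterior density plus a
    repulsion term that keeps the particle ensemble spread out."""
    n, _ = particles.shape
    sq = np.sum((particles[:, None, :] - particles[None, :, :]) ** 2, axis=-1)
    h = np.median(sq) / np.log(n + 1) + 1e-12          # median-heuristic bandwidth
    K = np.exp(-sq / h)                                # RBF kernel matrix
    grads = np.array([grad_log_post(p) for p in particles])
    attraction = K @ grads                             # sum_j k(x_j, x_i) grad log p(x_j)
    repulsion = (2.0 / h) * (particles * K.sum(axis=1, keepdims=True) - K @ particles)
    return particles + eps * (attraction + repulsion) / n
```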
Condensed Stein Variational Gradient Descent for Uncertainty Quantification of Neural Networks
Padmanabha, Govinda Anantha, Safta, Cosmin, Bouklas, Nikolaos, Jones, Reese E.
In the context of uncertainty quantification (UQ), the curse of dimensionality, whereby quantification efficiency degrades drastically with parameter dimension, is particularly extreme for highly parameterized models such as neural networks (NNs). Fortunately, in many cases, these models are overparameterized in the sense that the number of parameters can be reduced with negligible effects on accuracy and sometimes improvements in generalization [1]. Furthermore, NNs often have parameterizations with fungible parameters, such that permutations of the values and connections lead to equivalent output responses. This suggests that methods which simultaneously sparsify and characterize the uncertainty of a model, while handling and taking advantage of the symmetries inherent in the model, are potentially advantageous approaches. Although Markov chain Monte Carlo (MCMC) methods [2] have been the reference standard to generate samples for UQ methods, they can be temperamental and do not scale well for high-dimensional models. More recently, there has been widespread use of variational inference (VI) methods, which cast the parameter posterior sampling problem as an optimization of a surrogate posterior guided by a suitable objective, such as the Kullback-Leibler (KL) divergence between the surrogate posterior and the true posterior induced by the data. In particular, there is now a family of model ensemble methods based on Stein's identity [3], such as Stein variational gradient descent (SVGD) [4], projected SVGD [5], and the Stein variational Newton method [6]. These methods have advantages over MCMC by virtue of propagating in parallel a coordinated ensemble of particles that represents the empirical posterior.
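For reference, the standard SVGD update of [4] moves each particle along the kernelized Stein direction

$$ \phi^*(\theta_i) = \frac{1}{n}\sum_{j=1}^{n}\Big[ k(\theta_j,\theta_i)\,\nabla_{\theta_j}\log p(\theta_j\mid\mathcal{D}) + \nabla_{\theta_j}k(\theta_j,\theta_i) \Big], \qquad \theta_i \leftarrow \theta_i + \epsilon\,\phi^*(\theta_i), $$

where the first term draws the ensemble toward regions of high posterior density and the second acts as a repulsive force that maintains particle diversity; the condensed variant considered here builds sparsification on top of this type of update.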
A switching Kalman filter approach to online mitigation and correction of sensor corruption for inertial navigation
Mustaev, Artem, Galioto, Nicholas, Boler, Matt, Jakeman, John D., Safta, Cosmin, Gorodetsky, Alex
This paper introduces a novel approach to detect and address faulty or corrupted external sensors in the context of inertial navigation by leveraging a switching Kalman filter combined with parameter augmentation. Instead of discarding the corrupted data, the proposed method retains and processes it, running multiple observation models simultaneously and evaluating their likelihoods to accurately identify the true state of the system. We demonstrate the effectiveness of this approach both in identifying the moment a sensor becomes faulty and in correcting for the resulting sensor behavior to maintain accurate estimates. We apply our approach to balloon navigation in the atmosphere and to shuttle reentry. The results show that our method can accurately recover the true system state even in the presence of significant sensor bias, thereby improving the robustness and reliability of state estimation systems under challenging conditions. We also provide a statistical analysis of problem settings to determine when and where our method is most accurate and where it fails.
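The core mechanism, sketched below under simplifying assumptions (linear models, hypothetical matrices, no parameter augmentation), is to run several candidate observation models in parallel and let their innovation likelihoods re-weight the model probabilities:

```python
# Minimal sketch of a switching observation-model Kalman update: each candidate
# sensor model is evaluated on the same prior, innovation likelihoods re-weight
# the model probabilities, and the state estimate is a weighted combination.
import numpy as np
from scipy.stats import multivariate_normal

def switching_kf_update(x_prior, P_prior, y, obs_models, model_probs):
    """obs_models: list of (H, R) pairs, one per candidate observation model."""
    posteriors, likelihoods = [], []
    for H, R in obs_models:
        S = H @ P_prior @ H.T + R                 # innovation covariance
        K = P_prior @ H.T @ np.linalg.inv(S)      # Kalman gain
        innov = y - H @ x_prior
        x_post = x_prior + K @ innov
        P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
        posteriors.append((x_post, P_post))
        likelihoods.append(multivariate_normal.pdf(innov, mean=np.zeros(len(y)), cov=S))
    w = np.array(model_probs) * np.array(likelihoods)
    w /= w.sum()                                  # updated model probabilities
    x_mix = sum(wi * x for wi, (x, _) in zip(w, posteriors))
    return x_mix, posteriors, w
```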
Bayesian calibration of stochastic agent based model via random forest
Robertson, Connor, Safta, Cosmin, Collier, Nicholson, Ozik, Jonathan, Ray, Jaideep
Agent-based models (ABMs) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high-dimensional calibration can be computationally prohibitive. This paper presents a random forest-based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented, and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results, and their predictive performance is analyzed, showing improved performance at a reduced computational cost.
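A minimal sketch of this kind of surrogate construction, under illustrative assumptions (hypothetical array names and scikit-learn components, not the CityCOVID interface): PCA compresses the simulated time series and a random forest maps ABM parameters to the retained PCA coefficients:

```python
# Minimal sketch of a random-forest surrogate over PCA-reduced time series,
# assuming `params` (n_runs x n_params) ABM inputs and `qoi` (n_runs x n_times)
# simulated hospitalization/death curves; names are illustrative only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor

def build_surrogate(params, qoi, n_components=5):
    pca = PCA(n_components=n_components).fit(qoi)
    coeffs = pca.transform(qoi)                         # temporal dimension reduction
    rf = RandomForestRegressor(n_estimators=200).fit(params, coeffs)
    def surrogate(theta):
        # predict PCA coefficients, then reconstruct the full time series
        c = rf.predict(np.atleast_2d(theta))
        return pca.inverse_transform(c)
    return surrogate
```

Such a surrogate can then stand in for the ABM inside an MCMC likelihood evaluation.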
Equivariant graph convolutional neural networks for the representation of homogenized anisotropic microstructural mechanical response
Patel, Ravi, Safta, Cosmin, Jones, Reese E.
Composite materials with different microstructural material symmetries are common in engineering applications where grain structure, alloying, and particle/fiber packing are optimized via controlled manufacturing. In fact, these microstructural tunings can be done throughout a part to achieve functional gradation and optimization at a structural level. To predict the performance of a particular microstructural configuration, and thereby the overall performance, constitutive models of materials with microstructure are needed. In this work we present neural network architectures that provide effective homogenization models of materials with anisotropic components. These models satisfy equivariance and material symmetry principles inherently through a combination of equivariant and tensor basis operations. We demonstrate them on datasets of stochastic volume elements with different textures and phases where the material undergoes elastic and plastic deformation, and show that these network architectures provide significant performance improvements.
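Schematically, a tensor basis formulation expresses the homogenized response as

$$ \mathbf{S} \;=\; \sum_{i} c_i\!\left(\mathcal{I}_1,\ldots,\mathcal{I}_m\right)\,\mathbf{B}_i, $$

where the $\mathbf{B}_i$ are basis tensors formed from the inputs (e.g., deformation and microstructural orientation tensors), the $\mathcal{I}_k$ are their joint invariants, and the scalar coefficient functions $c_i$ are learned by the network, so that equivariance and material symmetry hold by construction. The notation is illustrative rather than the paper's exact formulation.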
Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes
Hauth, Jeremiah, Safta, Cosmin, Huan, Xun, Patel, Ravi G., Jones, Reese E.
The application of neural network models to scientific machine learning tasks has proliferated in recent years. In particular, neural network models have proved to be adept at modeling processes with spatial-temporal complexity. Nevertheless, these highly parameterized models have garnered skepticism in their ability to produce outputs with quantified error bounds over the regimes of interest. Hence there is a need to find uncertainty quantification methods that are suitable for neural networks. In this work we present comparisons of the parametric uncertainty quantification of neural networks modeling complex spatial-temporal processes with Hamiltonian Monte Carlo and Stein variational gradient descent and its projected variant. Specifically, we apply these methods to graph convolutional neural network models of evolving systems modeled with recurrent neural network and neural ordinary differential equation architectures. We show that Stein variational inference is a viable alternative to Monte Carlo methods with some clear advantages for complex neural network models. For our exemplars, Stein variational inference gave similar uncertainty profiles through time compared to Hamiltonian Monte Carlo, albeit with generally more generous variance. Projected Stein variational gradient descent also produced similar uncertainty profiles to the non-projected counterpart, but large reductions in the active weight space were confounded by the stability of the neural network predictions and the convoluted likelihood landscape.
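For reference, a minimal sketch of a single Hamiltonian Monte Carlo proposal with leapfrog integration; `log_post` and `grad_log_post` are hypothetical stand-ins for a network's log-posterior and its gradient (e.g., obtained by automatic differentiation over a flat parameter vector):

```python
# Minimal sketch of one HMC proposal (leapfrog + Metropolis correction) over a
# flat parameter vector; hypothetical log-posterior interface.
import numpy as np

def hmc_step(theta, log_post, grad_log_post, step=1e-3, n_leapfrog=20, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    p0 = rng.standard_normal(theta.shape)              # auxiliary momentum
    th = theta.copy()
    p = p0 + 0.5 * step * grad_log_post(th)             # initial half step
    for i in range(n_leapfrog):
        th = th + step * p
        if i < n_leapfrog - 1:
            p = p + step * grad_log_post(th)
    p = p + 0.5 * step * grad_log_post(th)               # final half step
    # Metropolis accept/reject on the Hamiltonian (neg. log-posterior + kinetic)
    h0 = -log_post(theta) + 0.5 * p0 @ p0
    h1 = -log_post(th) + 0.5 * p @ p
    return th if np.log(rng.uniform()) < h0 - h1 else theta
```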
Enhancing Polynomial Chaos Expansion Based Surrogate Modeling using a Novel Probabilistic Transfer Learning Strategy
Bridgman, Wyatt, Balakrishnan, Uma, Jones, Reese, Chen, Jiefu, Wu, Xuqing, Safta, Cosmin, Huang, Yueqin, Khalil, Mohammad
In the field of surrogate modeling, polynomial chaos expansion (PCE) allows practitioners to construct inexpensive yet accurate surrogates to be used in place of expensive forward model simulations. For black-box simulations, non-intrusive PCE allows the construction of these surrogates using a set of simulation response evaluations. In this context, the PCE coefficients can be obtained using linear regression, which is also known as point collocation or stochastic response surfaces. Regression exhibits better scalability and can handle noisy function evaluations in contrast to other non-intrusive approaches, such as projection. However, since over-sampling is generally advisable for the linear regression approach, the simulation requirements become prohibitive for expensive forward models. We propose to leverage transfer learning, whereby knowledge gained through similar PCE surrogate-construction tasks (source domains) is transferred to a new surrogate-construction task (target domain) that has a limited number of forward model simulations (training data). The proposed transfer learning strategy determines how much, if any, information to transfer using new techniques inspired by Bayesian modeling and data assimilation. The strategy is scrutinized using numerical investigations and applied to an engineering problem from the oil and gas industry.
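A minimal sketch of point-collocation PCE for a single standard-normal germ, with a hypothetical `forward_model` and probabilists' Hermite polynomials; the multivariate, transfer-learning setting of the paper is considerably more involved:

```python
# Minimal sketch of non-intrusive PCE by least-squares regression (point
# collocation) in one dimension; `forward_model` is a hypothetical simulator.
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def fit_pce(forward_model, order=5, n_samples=50, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.standard_normal(n_samples)          # germ samples; keep n_samples well above order+1
    y = np.array([forward_model(x) for x in xi])
    Psi = hermevander(xi, order)                 # regression matrix of He_0 .. He_order
    coeffs, *_ = np.linalg.lstsq(Psi, y, rcond=None)
    surrogate = lambda x: hermevander(np.atleast_1d(x), order) @ coeffs
    return coeffs, surrogate
```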
A Survey of Constrained Gaussian Process Regression: Approaches and Implementation Challenges
Swiler, Laura, Gulian, Mamikon, Frankel, Ari, Safta, Cosmin, Jakeman, John
Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a broader effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.
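As one simple illustration of a bound constraint (the survey covers far more sophisticated strategies), the sketch below draws from a standard GP posterior and keeps only the samples that satisfy a lower bound; the kernel and names are illustrative assumptions:

```python
# Minimal sketch of enforcing a bound constraint on a GP by rejection of
# posterior draws; 1-D inputs, RBF kernel, small noise/jitter.
import numpy as np

def rbf(XA, XB, ell=0.3):
    return np.exp(-0.5 * ((XA[:, None] - XB[None, :]) / ell) ** 2)

def bounded_gp_draws(X, y, Xs, lower=0.0, n_draws=100, noise=1e-6, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
    alpha = np.linalg.solve(K, y)
    mean = Ks.T @ alpha                                   # posterior mean at Xs
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks) + noise * np.eye(len(Xs))
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    return draws[(draws >= lower).all(axis=1)]            # keep only draws meeting the bound
```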
Compressive sensing adaptation for polynomial chaos expansions
Tsilifis, Panagiotis, Huan, Xun, Safta, Cosmin, Sargsyan, Khachik, Lacaze, Guilhem, Oefelein, Joseph C., Najm, Habib N., Ghanem, Roger G.
Basis adaptation in Homogeneous Chaos spaces relies on a suitable rotation of the underlying Gaussian germ. Several rotations have been proposed in the literature, resulting in adaptations with different convergence properties. In this paper we present a new adaptation mechanism that builds on compressive sensing algorithms, resulting in a reduced polynomial chaos approximation with optimal sparsity. The developed adaptation algorithm consists of a two-step optimization procedure that computes the optimal coefficients and the input projection matrix of a low-dimensional chaos expansion with respect to an optimally rotated basis. We demonstrate the attractive features of our algorithm through several numerical examples, including the application to Large-Eddy Simulation (LES) calculations of turbulent combustion in a HIFiRE scramjet engine.
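Schematically, the adaptation seeks a rotation of the Gaussian germ together with a sparse set of chaos coefficients, e.g.,

$$ \boldsymbol{\eta} = \mathbf{A}\,\boldsymbol{\xi}, \qquad \hat{\mathbf{c}} = \arg\min_{\mathbf{c}} \|\mathbf{c}\|_1 \quad \text{s.t.} \quad \|\boldsymbol{\Psi}(\boldsymbol{\eta})\,\mathbf{c} - \mathbf{y}\|_2 \le \varepsilon, $$

where $\mathbf{A}$ is an isometry defining the rotated germ $\boldsymbol{\eta}$, $\boldsymbol{\Psi}$ is the measurement matrix of chaos polynomials evaluated at the samples, and the $\ell_1$ objective promotes sparsity. The notation is illustrative; the paper's two-step procedure optimizes both the projection matrix and the coefficients.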