Collaborating Authors

 Maulik, Romit


Multiscale Graph Neural Network Autoencoders for Interpretable Scientific Machine Learning

arXiv.org Artificial Intelligence

The goal of this work is to address two limitations in autoencoder-based models: latent space interpretability and compatibility with unstructured meshes. This is accomplished here with the development of a novel graph neural network (GNN) autoencoding architecture with demonstrations on complex fluid flow applications. To address the first goal of interpretability, the GNN autoencoder achieves reduction in the number of nodes in the encoding stage through an adaptive graph reduction procedure. This reduction procedure essentially amounts to flowfield-conditioned node sampling and sensor identification, and produces interpretable latent graph representations tailored to the flowfield reconstruction task in the form of so-called masked fields. These masked fields allow the user to (a) visualize where in physical space a given latent graph is active, and (b) interpret the time-evolution of the latent graph connectivity in accordance with the time-evolution of unsteady flow features (e.g. recirculation zones, shear layers) in the domain. To address the goal of unstructured mesh compatibility, the autoencoding architecture utilizes a series of multi-scale message passing (MMP) layers, each of which models information exchange among node neighborhoods at various lengthscales. The MMP layer, which augments standard single-scale message passing with learnable coarsening operations, allows the decoder to more efficiently reconstruct the flowfield from the identified regions in the masked fields. Analysis of latent graphs produced by the autoencoder for various model settings is conducted using unstructured snapshot data sourced from large-eddy simulations in a backward-facing step (BFS) flow configuration with an OpenFOAM-based flow solver at high Reynolds numbers.


Stabilized Neural Ordinary Differential Equations for Long-Time Forecasting of Dynamical Systems

arXiv.org Artificial Intelligence

In data-driven modeling of spatiotemporal phenomena, careful consideration often needs to be given to capturing the dynamics of the high wavenumbers. This problem becomes especially challenging when the system of interest exhibits shocks or chaotic dynamics. We present a data-driven modeling method that accurately captures shocks and chaotic dynamics by proposing a novel architecture, the stabilized neural ordinary differential equation (ODE). In our proposed architecture, we learn the right-hand-side (RHS) of an ODE by adding the outputs of two neural networks (NNs) together, where one learns a linear term and the other a nonlinear term. Specifically, we implement this by training a sparse linear convolutional NN to learn the linear term and a dense fully-connected nonlinear NN to learn the nonlinear term. This is in contrast with the standard neural ODE, which involves training only a single NN for learning the RHS. We apply this setup to the viscous Burgers equation, which exhibits shocked behavior, and show better short-time tracking and prediction of the energy spectrum at high wavenumbers than a standard neural ODE. We also find that the stabilized neural ODE models are much more robust to noisy initial conditions than the standard neural ODE approach. We also apply this method to chaotic trajectories of the Kuramoto-Sivashinsky equation. In this case, stabilized neural ODEs keep long-time trajectories on the attractor, and are highly robust to noisy initial conditions, while standard neural ODEs fail at achieving either of these results. We conclude by demonstrating how stabilized neural ODEs provide a natural extension for use in reduced-order modeling by projecting the dynamics onto the eigenvectors of the learned linear term.
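The split of the right-hand side into a linear and a nonlinear network can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the weights are untrained, the three-point kernel, the RK4 integrator, the time step, and the names `linear_term`, `nonlinear_term`, `rhs`, `rk4_step`, and `make_params` are all assumptions made for illustration. The output layer of the nonlinear network is set to zero so that only the linear (diffusive) part acts.

```python
import numpy as np

def linear_term(u, kernel):
    """Periodic 1-D convolution; acts as a sparse linear operator on the state."""
    half = len(kernel) // 2
    return sum(k * np.roll(u, half - j) for j, k in enumerate(kernel))

def nonlinear_term(u, p):
    """Dense one-hidden-layer network for the nonlinear part of the RHS."""
    return p["W2"] @ np.tanh(p["W1"] @ u + p["b1"]) + p["b2"]

def rhs(u, p):
    """Stabilized RHS: the sum of the linear and nonlinear network outputs."""
    return linear_term(u, p["kernel"]) + nonlinear_term(u, p)

def rk4_step(u, dt, p):
    """One classical Runge-Kutta step (an assumed choice of integrator)."""
    k1 = rhs(u, p)
    k2 = rhs(u + 0.5 * dt * k1, p)
    k3 = rhs(u + 0.5 * dt * k2, p)
    k4 = rhs(u + dt * k3, p)
    return u + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def make_params(n, hidden, seed=0):
    """Hypothetical, untrained parameters; training would update all of them."""
    rng = np.random.default_rng(seed)
    return {
        "kernel": np.array([1.0, -2.0, 1.0]),  # diffusion-like stencil (assumed)
        "W1": 0.1 * rng.standard_normal((hidden, n)),
        "b1": np.zeros(hidden),
        "W2": np.zeros((n, hidden)),  # zero output layer: nonlinear term is off
        "b2": np.zeros(n),
    }

n = 32
params = make_params(n, hidden=16)
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
u0 = np.sin(x)
u = u0.copy()
for _ in range(50):
    u = rk4_step(u, 0.1, params)
```

With the nonlinear network switched off, the state decays under the diffusive linear term alone, which is the stabilizing role the abstract attributes to the linear part.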


Assessments of epistemic uncertainty using Gaussian stochastic weight averaging for fluid-flow regression

arXiv.org Artificial Intelligence

We use Gaussian stochastic weight averaging (SWAG) to assess the model-form uncertainty associated with neural-network-based function approximation relevant to fluid flows. SWAG approximates a posterior Gaussian distribution of each weight, given training data and a constant learning rate. Having access to this distribution, it is able to create multiple models with various combinations of sampled weights, which can be used to obtain ensemble predictions. The average of such an ensemble can be regarded as the 'mean estimation', whereas its standard deviation can be used to construct 'confidence intervals', which enable us to perform uncertainty quantification (UQ) with regard to the training process of neural networks. We utilize representative neural-network-based function approximation tasks for the following cases: (i) a two-dimensional circular-cylinder wake; (ii) the DayMET dataset (maximum daily temperature in North America); (iii) a three-dimensional square-cylinder wake; and (iv) urban flow, to assess the generalizability of the present idea for a wide range of complex datasets. SWAG-based UQ can be applied regardless of the network architecture, and therefore, we demonstrate the applicability of the method for two types of neural networks: (i) global field reconstruction from sparse sensors by combining a convolutional neural network (CNN) and a multi-layer perceptron (MLP); and (ii) far-field state estimation from sectional data with a two-dimensional CNN. We find that SWAG can obtain physically-interpretable confidence-interval estimates from the perspective of model-form uncertainty. This capability supports its use for a wide range of problems in science and engineering.
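The sampling idea can be sketched on a toy problem. The example below is an illustration under stated assumptions rather than the paper's setup: it uses a two-weight linear model instead of a neural network, synthetic data, and only a diagonal Gaussian over the weights (running first and second moments of the SGD iterates at a constant learning rate). All variable names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (assumed): y = 2 x + 0.5 + noise.
x = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 2.0 * x[:, 0] + 0.5 + 0.1 * rng.standard_normal(200)
X = np.hstack([x, np.ones_like(x)])  # columns: slope feature, bias feature

w = np.zeros(2)
lr = 0.05  # constant learning rate
first_moment = np.zeros(2)   # running mean of the weights
second_moment = np.zeros(2)  # running mean of the squared weights
n_collected = 0

for step in range(2000):
    idx = rng.integers(0, 200, size=16)  # mini-batch indices
    grad = 2.0 * X[idx].T @ (X[idx] @ w - y[idx]) / 16
    w -= lr * grad
    if step >= 1000:  # collect iterates only after a burn-in phase
        n_collected += 1
        first_moment += (w - first_moment) / n_collected
        second_moment += (w**2 - second_moment) / n_collected

# Diagonal Gaussian over the weights, estimated from the collected iterates.
variance = np.clip(second_moment - first_moment**2, 1e-12, None)

# Sample an ensemble of weight vectors and evaluate it at two test inputs.
samples = first_moment + np.sqrt(variance) * rng.standard_normal((30, 2))
X_test = np.array([[0.0, 1.0], [0.5, 1.0]])
preds = samples @ X_test.T        # shape (30, 2): one row per sampled model
mean_pred = preds.mean(axis=0)    # ensemble mean estimate
std_pred = preds.std(axis=0)      # spread used for confidence intervals
```

The ensemble mean plays the role of the mean estimate and the ensemble standard deviation gives the width of the confidence interval at each test input.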


Multi-fidelity reinforcement learning framework for shape optimization

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) is a promising outer-loop intelligence paradigm which can deploy problem solving strategies for complex tasks. Consequently, DRL has been utilized for several scientific applications, specifically in cases where classical optimization or control methods are limited. One key limitation of conventional DRL methods is their episode-hungry nature, which proves to be a bottleneck for tasks that involve costly evaluations of a numerical forward model. In this article, we address this limitation of DRL by introducing a controlled transfer learning framework that leverages a multi-fidelity simulation setting. Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers, where our framework can learn an optimal policy for generating efficient airfoil shapes by gathering knowledge from multi-fidelity environments and reduce computational costs by over 30%. Furthermore, our formulation promotes policy exploration and generalization to new environments, thereby preventing over-fitting to data from solely one fidelity. Our results demonstrate this framework's applicability to other scientific DRL scenarios where multi-fidelity environments can be used for policy learning.


Learning the temporal evolution of multivariate densities via normalizing flows

arXiv.org Machine Learning

In this work, we propose a method to learn probability distributions using sample path data from stochastic differential equations. Specifically, we consider temporally evolving probability distributions (e.g., those produced by integrating local or nonlocal Fokker-Planck equations). We analyze this evolution through machine learning assisted construction of a time-dependent mapping that takes a reference distribution (say, a Gaussian) to each and every instance of our evolving distribution. If the reference distribution is the initial condition of a Fokker-Planck equation, what we learn is the time-T map of the corresponding solution. Specifically, the learned map is a normalizing flow that deforms the support of the reference density to the support of each and every density snapshot in time. We demonstrate that this approach can learn solutions to non-local Fokker-Planck equations, such as those arising in systems driven by both Brownian and Lévy noise. We present examples with two- and three-dimensional, uni- and multimodal distributions to validate the method.


Data-driven geophysical forecasting: Simple, low-cost, and accurate baselines with kernel methods

arXiv.org Machine Learning

Modeling geophysical systems as dynamical systems and regressing their vector field from data is a simple way to learn emulators for such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), the resulting data-driven models are not only faster than equation-based models but also easier to train than neural networks such as the long short-term memory neural network. In addition, they are more accurate and predictive than the latter. When trained on observational data for the global sea-surface temperature, the proposed technique shows considerable gains over classical partial differential equation-based models in terms of forecast computational cost and accuracy. When trained on publicly available re-analysis data for temperatures in the North-American continent, we see significant improvements over climatology- and persistence-based forecast techniques.
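Regressing the dynamics with a kernel method can be sketched with kernel ridge regression on a toy system. This is a simplified illustration, not the paper's method: the Gaussian kernel is fixed rather than learned with kernel flows, the one-step map is regressed as a discrete-time stand-in for the vector field, and the damped-rotation system, lengthscale, ridge value, and function names are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def fit(X, Y, lengthscale=1.0, ridge=1e-6):
    """Solve (K + ridge * I) alpha = Y for the kernel ridge coefficients."""
    K = rbf_kernel(X, X, lengthscale)
    return np.linalg.solve(K + ridge * np.eye(len(X)), Y)

def predict(X_new, X, alpha, lengthscale=1.0):
    """Evaluate the learned one-step map at new states."""
    return rbf_kernel(X_new, X, lengthscale) @ alpha

# Toy dynamics (assumed): a slightly damped rotation in the plane.
theta = 0.1
A_true = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])

rng = np.random.default_rng(0)
X_train = rng.uniform(-1.5, 1.5, size=(200, 2))  # sampled states
Y_train = X_train @ A_true.T                     # their one-step successors
alpha = fit(X_train, Y_train)

# Roll the emulator forward and compare against the true trajectory.
state = np.array([[1.0, 0.0]])
forecast = [state[0]]
truth = [state[0]]
for _ in range(10):
    state = predict(state, X_train, alpha)
    forecast.append(state[0])
    truth.append(A_true @ truth[-1])
forecast = np.array(forecast)
truth = np.array(truth)
max_err = np.max(np.abs(forecast - truth))
```

Training reduces to one linear solve, which is the sense in which such emulators are cheap to fit compared with gradient-based training of a recurrent network.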


Latent-space time evolution of non-intrusive reduced-order models using Gaussian process emulation

arXiv.org Machine Learning

Non-intrusive reduced-order models (ROMs) have recently generated considerable interest for constructing computationally efficient counterparts of nonlinear dynamical systems emerging from various domain sciences. They provide a low-dimensional emulation framework for systems that may be intrinsically high-dimensional. This is accomplished by utilizing a construction algorithm that is purely data-driven. It is no surprise, therefore, that the algorithmic advances of machine learning have led to non-intrusive ROMs with greater accuracy and computational gains. However, in bypassing the utilization of an equation-based evolution, it is often seen that the interpretability of the ROM framework suffers. This becomes more problematic when black-box deep learning methods are used, which are notorious for lacking robustness outside the physical regime of the observed data. In this article, we propose the use of a novel latent-space interpolation algorithm based on Gaussian process regression. Notably, this reduced-order evolution of the system is parameterized by control parameters to allow for interpolation in space. This procedure also yields a continuous interpretation of time, which allows for temporal interpolation. The latter aspect provides information, with quantified uncertainty, about full-state evolution at a finer resolution than that utilized for training the ROMs. We assess the viability of this algorithm for an advection-dominated system given by the inviscid shallow water equations.
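Temporal interpolation of a latent coefficient with quantified uncertainty can be sketched with a small Gaussian process regression in NumPy. This is a hedged illustration: the single latent coefficient `sin(t)`, the squared-exponential kernel, its hyperparameters, and the names `rbf` and `gp_posterior` are assumptions, and the control-parameter dependence described in the abstract is omitted.

```python
import numpy as np

def rbf(t1, t2, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(t_train, a_train, t_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at t_query."""
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    K_s = rbf(t_query, t_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, a_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    prior_var = np.diag(rbf(t_query, t_query))
    var = np.clip(prior_var - np.sum(v**2, axis=0), 0.0, None)
    return mean, np.sqrt(var)

# Coarse-in-time snapshots of one latent coefficient (assumed toy signal).
t_train = np.linspace(0.0, 2.0 * np.pi, 15)
a_train = np.sin(t_train)

# Query at a finer temporal resolution than the training snapshots.
t_fine = np.linspace(0.0, 2.0 * np.pi, 57)
mean, std = gp_posterior(t_train, a_train, t_fine)
```

In a ROM setting, the same regression would be applied to each latent coefficient, and the interpolated coefficients would then be lifted back to the full state through the reduced basis or decoder.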


Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models

arXiv.org Machine Learning

Rapid simulations of advection-dominated problems are vital for multiple engineering and geophysical applications. In this paper, we present a long short-term memory neural network to approximate the nonlinear component of the reduced-order model (ROM) of an advection-dominated partial differential equation. This is motivated by the fact that the nonlinear term is the most expensive component of a successful ROM. For our approach, we utilize a Galerkin projection to isolate the linear and the transient components of the dynamical system and then use discrete empirical interpolation to generate training data for supervised learning. We note that the numerical time-advancement and linear-term computation of the system ensure a greater preservation of physics than does a process that is fully modeled. Our results show that the proposed framework recovers transient dynamics accurately without nonlinear term computations in full-order space and represents a cost-effective alternative to solely equation-based ROMs.


Site-specific graph neural network for predicting protonation energy of oxygenate molecules

arXiv.org Machine Learning

Bio-oil molecule assessment is essential for the sustainable development of chemicals and transportation fuels. These oxygenated molecules have adequate carbon, hydrogen, and oxygen atoms that can be used for developing new value-added molecules (chemicals or transportation fuels). One motivation for our study stems from the fact that liquid-phase upgrading using mineral acid is a cost-effective chemical transformation. In this chemical upgrading process, adding a proton (positively charged atomic hydrogen) to an oxygen atom is a central step. The protonation energies of oxygen atoms in a molecule determine the thermodynamic feasibility of the reaction and the likely chemical reaction pathway. A quantum chemical model based on coupled cluster theory is used to compute accurate thermochemical properties such as the protonation energies of oxygen atoms and the feasibility of protonation-based chemical transformations. However, this method is too computationally expensive to explore a large space of chemical transformations. We develop a graph neural network approach for predicting protonation energies of oxygen atoms of hundreds of bioxygenate molecules to predict the feasibility of aqueous acidic reactions. Our approach relies on an iterative local nonlinear embedding that gradually leads to global influence of distant atoms and an output layer that predicts the protonation energy. Our approach is geared to site-specific predictions for individual oxygen atoms of a molecule, in contrast to commonly used graph convolutional networks that focus on a singular molecular property prediction. We demonstrate that our approach is effective in learning the location and magnitudes of protonation energies of oxygenated molecules.