Lusch, Bethany
Scalable and Consistent Graph Neural Networks for Distributed Mesh-based Data-driven Modeling
Barwey, Shivam, Balin, Riccardo, Lusch, Bethany, Patel, Saumil, Balakrishnan, Ramesh, Pal, Pinaki, Maulik, Romit, Vishwanath, Venkatram
This work develops a distributed graph neural network (GNN) methodology for mesh-based modeling applications using a consistent neural message passing layer. As the name implies, the focus is on enabling scalable operations that satisfy physical consistency via halo nodes at sub-graph boundaries. Here, consistency refers to the fact that a GNN trained and evaluated on one rank (one large graph) is arithmetically equivalent to evaluations on multiple ranks (a partitioned graph). This concept is demonstrated by interfacing GNNs with NekRS, a GPU-capable exascale CFD solver developed at Argonne National Laboratory. It is shown how the NekRS mesh partitioning can be linked to the distributed GNN training and inference routines, resulting in a scalable mesh-based data-driven modeling workflow. We study the impact of consistency on the scalability of mesh-based GNNs, demonstrating efficient scaling in consistent GNNs for up to O(1B) graph nodes on the Frontier exascale supercomputer.
Computationally Efficient Data-Driven Discovery and Linear Representation of Nonlinear Systems For Control
Tiwari, Madhur, Nehma, George, Lusch, Bethany
This work focuses on developing a data-driven framework using Koopman operator theory for system identification and linearization of nonlinear systems for control. Our proposed method presents a deep learning framework with recursive learning. The resulting linear system is controlled using a linear quadratic control. An illustrative example using a pendulum system is presented with simulations on noisy data. We show that our proposed method is trained more efficiently and is more accurate than an autoencoder baseline.
Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models
Maulik, Romit, Rao, Vishwas, Madireddy, Sandeep, Lusch, Bethany, Balaprakash, Prasanna
Rapid simulations of advection-dominated problems are vital for multiple engineering and geophysical applications. In this paper, we present a long short-term memory neural network to approximate the nonlinear component of the reduced-order model (ROM) of an advection-dominated partial differential equation. This is motivated by the fact that the nonlinear term is the most expensive component of a successful ROM. For our approach, we utilize a Galerkin projection to isolate the linear and the transient components of the dynamical system and then use discrete empirical interpolation to generate training data for supervised learning. We note that the numerical time-advancement and linear-term computation of the system ensures a greater preservation of physics than does a process that is fully modeled. Our results show that the proposed framework recovers transient dynamics accurately without nonlinear term computations in full-order space and represents a cost-effective alternative to solely equation-based ROMs.
Deep learning for universal linear embeddings of nonlinear dynamics
Lusch, Bethany, Kutz, J. Nathan, Brunton, Steven L.
Identifying coordinate transformations that make strongly nonlinear dynamics approximately linear is a central challenge in modern dynamical systems. These transformations have the potential to enable prediction, estimation, and control of nonlinear systems using standard linear theory. The Koopman operator has emerged as a leading data-driven embedding, as eigenfunctions of this operator provide intrinsic coordinates that globally linearize the dynamics. However, identifying and representing these eigenfunctions has proven to be mathematically and computationally challenging. This work leverages the power of deep learning to discover representations of Koopman eigenfunctions from trajectory data of dynamical systems. Our network is parsimonious and interpretable by construction, embedding the dynamics on a low-dimensional manifold that is of the intrinsic rank of the dynamics and parameterized by the Koopman eigenfunctions. In particular, we identify nonlinear coordinates on which the dynamics are globally linear using a modified auto-encoder. We also generalize Koopman representations to include a ubiquitous class of systems that exhibit continuous spectra, ranging from the simple pendulum to nonlinear optics and broadband turbulence. Our framework parametrizes the continuous frequency using an auxiliary network, enabling a compact and efficient embedding at the intrinsic rank, while connecting our models to half a century of asymptotics. In this way, we benefit from the power and generality of deep learning, while retaining the physical interpretability of Koopman embeddings.
Modeling cognitive deficits following neurodegenerative diseases and traumatic brain injuries with deep convolutional neural networks
Lusch, Bethany, Weholt, Jake, Maia, Pedro D., Kutz, J. Nathan
The accurate diagnosis and assessment of neurodegenerative disease and traumatic brain injuries (TBI) remain open challenges. Both cause cognitive and functional deficits due to focal axonal swellings (FAS), but it is difficult to deliver a prognosis due to our limited ability to assess damaged neurons at a cellular level in vivo. We simulate the effects of neurodegenerative disease and TBI using convolutional neural networks (CNNs) as our model of cognition. We utilize biophysically relevant statistical data on FAS to damage the connections in CNNs in a functionally relevant way. We incorporate energy constraints on the brain by pruning the CNNs to be less over-engineered. Qualitatively, we demonstrate that damage leads to human-like mistakes. Our experiments also provide quantitative assessments of how accuracy is affected by various types and levels of damage. The deficit resulting from a fixed amount of damage greatly depends on which connections are randomly injured, providing intuition for why it is difficult to predict impairments. There is a large degree of subjectivity when it comes to interpreting cognitive deficits from complex systems such as the human brain. However, we provide important insight and a quantitative framework for disorders in which FAS are implicated.
Shape Constrained Tensor Decompositions using Sparse Representations in Over-Complete Libraries
Lusch, Bethany, Chi, Eric C., Kutz, J. Nathan
We consider $N$-way data arrays and low-rank tensor factorizations where the time mode is coded as a sparse linear combination of temporal elements from an over-complete library. Our method, Shape Constrained Tensor Decomposition (SCTD) is based upon the CANDECOMP/PARAFAC (CP) decomposition which produces $r$-rank approximations of data tensors via outer products of vectors in each dimension of the data. By constraining the vector in the temporal dimension to known analytic forms which are selected from a large set of candidate functions, more readily interpretable decompositions are achieved and analytic time dependencies discovered. The SCTD method circumvents traditional {\em flattening} techniques where an $N$-way array is reshaped into a matrix in order to perform a singular value decomposition. A clear advantage of the SCTD algorithm is its ability to extract transient and intermittent phenomena which is often difficult for SVD-based methods. We motivate the SCTD method using several intuitively appealing results before applying it on a number of high-dimensional, real-world data sets in order to illustrate the efficiency of the algorithm in extracting interpretable spatio-temporal modes. With the rise of data-driven discovery methods, the decomposition proposed provides a viable technique for analyzing multitudes of data in a more comprehensible fashion.
Submodular Hamming Metrics
Gillenwater, Jennifer A., Iyer, Rishabh K., Lusch, Bethany, Kidambi, Rahul, Bilmes, Jeff A.
We show that there is a largely unexplored class of functions (positive polymatroids) that can define proper discrete metrics over pairs of binary vectors and that are fairly tractable to optimize over. By exploiting submodularity, we are able to give hardness results and approximation algorithms for optimizing over such metrics. Additionally, we demonstrate empirically the effectiveness of these metrics and associated algorithms on both a metric minimization task (a form of clustering) and also a metric maximization task (generating diverse k-best lists).