Viswanathan, Venkatasubramanian
Learning second-order TVD flux limiters using differentiable solvers
Huang, Chenyang, Sebastian, Amal S., Viswanathan, Venkatasubramanian
This paper presents a data-driven framework for learning optimal second-order total variation diminishing (TVD) flux limiters via differentiable simulations. In our fully differentiable finite volume solvers, the limiter functions are replaced by neural networks. By representing the limiter as a pointwise convex linear combination of the Minmod and Superbee limiters, we enforce both second-order accuracy and TVD constraints at all stages of training. Our approach leverages gradient-based optimization through automatic differentiation, allowing direct backpropagation of errors from numerical solutions to the limiter parameters. We demonstrate the effectiveness of this method on various hyperbolic conservation laws, including the linear advection equation, Burgers' equation, and the one-dimensional Euler equations. Remarkably, a limiter trained solely on linear advection exhibits strong generalizability, surpassing the accuracy of most classical flux limiters across a range of problems with shocks and discontinuities. The learned flux limiters can be readily integrated into existing computational fluid dynamics codes, and the proposed methodology offers a flexible pathway to systematically develop and optimize flux limiters for complex flow problems.
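A minimal sketch of the convexity trick described above: because Minmod and Superbee bound the second-order TVD region from below and above, any pointwise convex combination of the two stays inside it by construction. The `weight_fn` below is a hypothetical stand-in for the trained network, not the paper's model.

```python
import numpy as np

def minmod(r):
    # Lower envelope of the second-order TVD region
    return np.maximum(0.0, np.minimum(1.0, r))

def superbee(r):
    # Upper envelope of the second-order TVD region
    return np.maximum.reduce([np.zeros_like(r),
                              np.minimum(2.0 * r, 1.0),
                              np.minimum(r, 2.0)])

def learned_limiter(r, weight_fn):
    # Convex combination: any weight in [0, 1] keeps the limiter
    # between the two envelopes, so the TVD constraint holds.
    w = weight_fn(r)
    return w * minmod(r) + (1.0 - w) * superbee(r)

# Hypothetical stand-in for the trained network: a sigmoid of a
# fixed affine map of the slope ratio r.
weight_fn = lambda r: 1.0 / (1.0 + np.exp(-(0.5 * r - 0.2)))

r = np.linspace(-1.0, 3.0, 9)
phi = learned_limiter(r, weight_fn)
assert np.all(phi >= minmod(r) - 1e-12) and np.all(phi <= superbee(r) + 1e-12)
```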
Smirk: An Atomically Complete Tokenizer for Molecular Foundation Models
Wadell, Alexius, Bhutani, Anoushka, Viswanathan, Venkatasubramanian
Molecular Foundation Models are emerging as powerful tools for accelerating molecular design, material science, and cheminformatics, leveraging transformer architectures to speed up the discovery of new materials and drugs while reducing the computational cost of traditional ab initio methods. However, current models are constrained by closed-vocabulary tokenizers that fail to capture the full diversity of molecular structures. In this work, we systematically evaluate thirteen chemistry-specific tokenizers for their coverage of the SMILES language, uncovering substantial gaps. Using N-gram language models, we assess the impact of tokenizer choice on model performance and quantify the information loss incurred by unknown tokens. We introduce two new tokenizers, smirk and smirk-gpe, which can represent the entirety of the OpenSMILES specification while avoiding the pitfalls of existing tokenizers. Our work highlights the importance of open-vocabulary modeling for molecular foundation models and the need for chemically diverse benchmarks for cheminformatics.
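A toy illustration of the coverage measurement described above, assuming a naive single-character tokenizer and an invented closed vocabulary; the paper's actual tokenizers and corpora differ.

```python
# Fraction of tokens a closed-vocabulary tokenizer maps to <unk>.
# The vocabulary and tokenization rule here are invented for illustration.
CLOSED_VOCAB = {"C", "c", "N", "O", "(", ")", "=", "1", "2", "[", "]", "@", "H"}

def tokenize(smiles, vocab):
    # Naive single-character tokenization; real SMILES tokenizers use
    # multi-character tokens such as "Cl", "Br", or whole bracket atoms.
    return [ch if ch in vocab else "<unk>" for ch in smiles]

def unk_rate(corpus, vocab):
    toks = [t for s in corpus for t in tokenize(s, vocab)]
    return sum(t == "<unk>" for t in toks) / len(toks)

corpus = ["CCO", "c1ccccc1", "CC(=O)O", "ClCCl", "[Se]"]  # "Cl", "Se" fall outside the toy vocab
print(f"unknown-token rate: {unk_rate(corpus, CLOSED_VOCAB):.2%}")
```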
Self-supervised Pretraining for Partial Differential Equations
Madhavan, Varun, Sebastian, Amal S, Ramsundar, Bharath, Viswanathan, Venkatasubramanian
In this work, we describe a novel approach to building a neural PDE solver, leveraging recent advances in transformer-based neural network architectures. Our model provides solutions for different values of the PDE parameters without any retraining of the network. Training is carried out in a self-supervised manner, similar to pretraining approaches applied in language and vision tasks. We hypothesize that the model in effect learns a family of operators (for multiple parameters) mapping the initial condition to the solution of the PDE at any future time step t. We compare this approach with the Fourier Neural Operator (FNO) and demonstrate that it generalizes over the space of PDE parameters, despite having a higher prediction error for individual parameter values than the FNO. We show that performance on a specific parameter can be improved by finetuning the model with very small amounts of data. We also demonstrate that the model scales with both data and model size.
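A schematic PyTorch sketch of the training setup described above: a model conditioned on the initial condition, a PDE parameter, and a query time t is trained against solver-generated targets. The class, shapes, and single-token encoding are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ParametricPDEModel(nn.Module):
    def __init__(self, n_grid, d_model=64):
        super().__init__()
        self.encode = nn.Linear(n_grid + 2, d_model)  # u0 plus (parameter, t)
        self.core = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        self.decode = nn.Linear(d_model, n_grid)

    def forward(self, u0, nu, t):
        x = torch.cat([u0, nu[:, None], t[:, None]], dim=-1)
        h = self.core(self.encode(x)[:, None, :])     # single "token" for brevity
        return self.decode(h[:, 0])

n_grid, batch = 128, 8
model = ParametricPDEModel(n_grid)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

u0 = torch.randn(batch, n_grid)    # initial conditions
nu = torch.rand(batch)             # PDE parameter, e.g. viscosity
t = torch.rand(batch)              # query time
u_t = torch.randn(batch, n_grid)   # stand-in for solver-generated targets

loss = nn.functional.mse_loss(model(u0, nu, t), u_t)
opt.zero_grad(); loss.backward(); opt.step()
```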
Differentiable Modeling and Optimization of Battery Electrolyte Mixtures Using Geometric Deep Learning
Zhu, Shang, Ramsundar, Bharath, Annevelink, Emil, Lin, Hongyi, Dave, Adarsh, Guan, Pin-Wen, Gering, Kevin, Viswanathan, Venkatasubramanian
Electrolytes play a critical role in designing next-generation battery systems, by allowing efficient ion transfer, preventing charge transfer, and stabilizing electrode-electrolyte interfaces. In this work, we develop a differentiable geometric deep learning (GDL) model for chemical mixtures, DiffMix, which is applied in guiding robotic experimentation and optimization towards fast-charging battery electrolytes. In particular, we extend mixture thermodynamic and transport laws by creating GDL-learnable physical coefficients. We evaluate our model on mixture thermodynamics and ion transport properties, showing that DiffMix achieves better prediction accuracy and robustness than its purely data-driven variants. Furthermore, with a robotic experimentation setup, Clio, we improve the ionic conductivity of electrolytes by over 18.8% within 10 experimental steps, via differentiable optimization built on DiffMix gradients. By combining GDL, mixture physics laws, and robotic experimentation, DiffMix expands the predictive modeling methods for chemical mixtures and enables efficient optimization in large chemical spaces.
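A hedged sketch of the differentiable-optimization loop described above: a toy quadratic mixture law with invented coefficients stands in for DiffMix's learned physical laws, and mole fractions are optimized for conductivity by gradient ascent through a softmax parameterization.

```python
import torch

# Illustrative per-species contributions and pairwise mixing terms; in
# DiffMix such physical coefficients are learnable, here they are fixed.
coeffs = torch.tensor([1.2, 0.8, 0.5])
interactions = torch.tensor([[0.0, -0.3, 0.1],
                             [-0.3, 0.0, 0.2],
                             [0.1, 0.2, 0.0]])

def conductivity(x):
    # Toy quadratic mixture law standing in for the model's transport laws
    return coeffs @ x + x @ interactions @ x

logits = torch.zeros(3, requires_grad=True)  # unconstrained parameters
opt = torch.optim.Adam([logits], lr=0.1)
for step in range(50):
    x = torch.softmax(logits, dim=0)         # mole fractions sum to 1
    loss = -conductivity(x)                  # maximize conductivity
    opt.zero_grad(); loss.backward(); opt.step()

print("optimized fractions:", torch.softmax(logits, dim=0).detach().numpy())
```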
Differentiable Turbulence II
Shankar, Varun, Maulik, Romit, Viswanathan, Venkatasubramanian
Differentiable fluid simulators are increasingly demonstrating value as useful tools for developing data-driven models in computational fluid dynamics (CFD). Differentiable turbulence, or the end-to-end training of machine learning (ML) models embedded in CFD solution algorithms, captures both the generalization power and limited upfront cost of physics-based simulations, and the flexibility and automated training of deep learning methods. We develop a framework for integrating deep learning models into a generic finite element numerical scheme for solving the Navier-Stokes equations, applying the technique to learn a sub-grid scale closure using a multi-scale graph neural network. We demonstrate the method on several realizations of flow over a backward-facing step, testing on both unseen Reynolds numbers and new geometry. We show that the learned closure can achieve accuracy comparable to traditional large eddy simulation on a finer grid, amounting to an equivalent speedup of 10x. As the desire and need for cheaper CFD simulations grows, we see hybrid physics-ML methods as a path forward to be exploited in the near future.
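A minimal solver-in-the-loop sketch, assuming a toy 1D diffusion stencil in place of the paper's finite element Navier-Stokes scheme: a closure network is embedded in the time-stepping loop and trained on the unrolled trajectory error, so gradients flow through the solver itself.

```python
import torch
import torch.nn as nn

# Learned sub-grid correction; architecture is illustrative only
closure = nn.Sequential(nn.Conv1d(1, 16, 3, padding=1), nn.Tanh(),
                        nn.Conv1d(16, 1, 3, padding=1))

def step(u, dt=0.01):
    # Coarse physics (periodic diffusion stencil) plus the learned correction
    lap = torch.roll(u, 1, -1) - 2 * u + torch.roll(u, -1, -1)
    return u + dt * (lap + closure(u[:, None, :])[:, 0, :])

opt = torch.optim.Adam(closure.parameters(), lr=1e-3)
u0 = torch.randn(4, 64)
reference = [torch.randn(4, 64) for _ in range(10)]  # stand-in for fine-grid data

u, loss = u0, 0.0
for target in reference:                             # unrolled trajectory
    u = step(u)
    loss = loss + nn.functional.mse_loss(u, target)  # gradients flow through the solver
opt.zero_grad(); loss.backward(); opt.step()
```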
Differentiable Turbulence
Shankar, Varun, Maulik, Romit, Viswanathan, Venkatasubramanian
Deep learning is increasingly becoming a promising pathway to improving the accuracy of sub-grid scale (SGS) turbulence closure models for large eddy simulations (LES). We leverage the concept of differentiable turbulence, whereby an end-to-end differentiable solver is used in combination with physics-inspired choices of deep learning architectures to learn highly effective and versatile SGS models for two-dimensional turbulent flow. We perform an in-depth analysis of the inductive biases in the chosen architectures, finding that the inclusion of small-scale non-local features is most critical to effective SGS modeling, while large-scale features can improve pointwise accuracy of the a-posteriori solution field. The filtered velocity gradient tensor can be mapped directly to the SGS stress via decomposition of the inputs and outputs into isotropic, deviatoric, and anti-symmetric components. We see that the model can generalize to a variety of flow configurations, including higher and lower Reynolds numbers and different forcing conditions. We show that the differentiable physics paradigm is more successful than offline, a-priori learning, and that hybrid solver-in-the-loop approaches to deep learning offer an ideal balance between computational efficiency, accuracy, and generalization. Our experiments provide physics-based recommendations for deep-learning-based SGS modeling toward generalizable turbulence closures.
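The input/output decomposition mentioned above can be stated concretely; the following sketch splits a 2D velocity gradient tensor into its isotropic, deviatoric, and anti-symmetric parts and checks that they reconstruct it.

```python
import numpy as np

g = np.random.randn(2, 2)                  # filtered velocity gradient tensor

iso = (np.trace(g) / 2.0) * np.eye(2)      # isotropic part (carries the divergence in 2D)
dev = 0.5 * (g + g.T) - iso                # deviatoric part (traceless symmetric strain rate)
antisym = 0.5 * (g - g.T)                  # anti-symmetric part (rotation rate)

assert np.allclose(iso + dev + antisym, g)  # the three parts reconstruct the tensor
```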
Accurate Surface and Finite Temperature Bulk Properties of Lithium Metal at Large Scales using Machine Learning Interaction Potentials
Phuthi, Mgcini Keith, Yao, Archie Mingze, Batzner, Simon, Musaelian, Albert, Kozinsky, Boris, Cubuk, Ekin Dogus, Viswanathan, Venkatasubramanian
The properties of lithium metal are key parameters in the design of lithium-ion and lithium metal batteries. They are difficult to probe experimentally due to lithium's high reactivity and low melting point, as well as the microscopic scales at which lithium exists in batteries, where it is found to have enhanced strength, with implications for dendrite suppression strategies. Computationally, there is a lack of empirical potentials that are consistently quantitatively accurate across all properties, and ab initio calculations are too costly. In this work, we train Machine Learning Interaction Potentials (MLIPs) on Density Functional Theory (DFT) data to state-of-the-art accuracy, reproducing experimental and ab initio results across a wide range of simulations at large length and time scales. We accurately predict thermodynamic properties, phonon spectra, the temperature dependence of elastic constants, and various surface properties inaccessible with DFT. We establish that there exists a Bell-Evans-Polanyi relation correlating the self-adsorption energy and the minimum surface diffusion barrier for high Miller index facets.
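For context, a Bell-Evans-Polanyi relation posits a linear dependence of a kinetic barrier on a thermodynamic driving force; the sketch below fits such a line to synthetic values, not the paper's MLIP data.

```python
import numpy as np

# Synthetic (invented) per-facet values, for illustration only
e_ads = np.array([-0.42, -0.38, -0.35, -0.30, -0.27])  # self-adsorption energies (eV)
e_barrier = np.array([0.06, 0.08, 0.10, 0.13, 0.15])   # minimum diffusion barriers (eV)

slope, intercept = np.polyfit(e_ads, e_barrier, 1)     # linear BEP fit
print(f"BEP fit: E_barrier = {slope:.2f} * E_ads + {intercept:.2f} (eV)")
```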
Chemellia: An Ecosystem for Atomistic Scientific Machine Learning
Thazhemadam, Anant, Gandhi, Dhairya, Viswanathan, Venkatasubramanian, Kurchin, Rachel C.
Chemellia is an open-source framework for atomistic machine learning in the Julia programming language. The framework takes advantage of Julia's high speed as well as the ability to share and reuse code and interfaces through the paradigm of multiple dispatch. Chemellia is designed to make use of existing interfaces and avoid "reinventing the wheel" wherever possible. A key aspect of the Chemellia ecosystem is the ChemistryFeaturization interface for defining and encoding features -- it is designed to maximize interoperability between featurization schemes and elements thereof, to maintain provenance of encoded features, and to ensure easy decodability and reconfigurability to enable feature engineering experiments. This embodies the overall design principles of the Chemellia ecosystem: separation of concerns, interoperability, and transparency. We illustrate these principles by discussing the implementation of crystal graph convolutional neural networks for material property prediction.
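The encode/decode contract behind a ChemistryFeaturization-style interface can be sketched schematically; note that the real interface is written in Julia and built on multiple dispatch, so the Python below only illustrates the round-trip property, with invented names.

```python
class OneHotFeaturizer:
    """Toy featurizer illustrating provenance and decodability."""

    def __init__(self, categories):
        self.categories = list(categories)  # provenance: the encoding scheme travels with the features

    def encode(self, value):
        vec = [0] * len(self.categories)
        vec[self.categories.index(value)] = 1
        return vec

    def decode(self, vec):
        # Decodability: every encoded feature maps back to its source value
        return self.categories[vec.index(1)]

f = OneHotFeaturizer(["H", "Li", "C", "O"])
assert f.decode(f.encode("Li")) == "Li"
```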
Importance of equivariant and invariant symmetries for fluid flow modeling
Shankar, Varun, Barwey, Shivam, Kolter, Zico, Maulik, Romit, Viswanathan, Venkatasubramanian
Graph neural networks (GNNs) have shown promise in learning unstructured mesh-based simulations of physical systems, including fluid dynamics. In tandem, geometric deep learning principles have informed the development of equivariant architectures respecting underlying physical symmetries. However, the effect of rotational equivariance in modeling fluids remains unclear. We build a multi-scale equivariant GNN to forecast fluid flow and study the effect of modeling invariant and non-invariant representations of the flow state. We evaluate the performance of several equivariant and non-equivariant architectures in predicting the evolution of two fluid flows, flow around a cylinder and buoyancy-driven shear flow, to understand the effect of equivariance and invariance on data-driven modeling approaches. Our results show that modeling invariant quantities produces more accurate long-term predictions and that these invariant quantities may be learned from the velocity field using a data-driven encoder.
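The distinction studied above can be checked numerically: an equivariant map commutes with rotations, f(Rx) = R f(x), while an invariant scalar is unchanged, s(Rx) = s(x). The maps below are toy examples, not the paper's architectures.

```python
import numpy as np

def f(x):
    return np.linalg.norm(x) * x   # equivariant: output rotates with the input

def s(x):
    return np.linalg.norm(x)       # invariant: unchanged by rotation

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.array([1.0, 2.0])

assert np.allclose(f(R @ x), R @ f(x))  # equivariance holds
assert np.isclose(s(R @ x), s(x))       # invariance holds
```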
Multiscale Graph Neural Network Autoencoders for Interpretable Scientific Machine Learning
Barwey, Shivam, Shankar, Varun, Viswanathan, Venkatasubramanian, Maulik, Romit
The goal of this work is to address two limitations in autoencoder-based models: latent space interpretability and compatibility with unstructured meshes. This is accomplished here with the development of a novel graph neural network (GNN) autoencoding architecture with demonstrations on complex fluid flow applications. To address the first goal of interpretability, the GNN autoencoder achieves a reduction in the number of nodes in the encoding stage through an adaptive graph reduction procedure. This reduction procedure essentially amounts to flowfield-conditioned node sampling and sensor identification, and produces interpretable latent graph representations tailored to the flowfield reconstruction task in the form of so-called masked fields. These masked fields allow the user to (a) visualize where in physical space a given latent graph is active, and (b) interpret the time-evolution of the latent graph connectivity in accordance with the time-evolution of unsteady flow features (e.g. recirculation zones, shear layers) in the domain. To address the goal of unstructured mesh compatibility, the autoencoding architecture utilizes a series of multi-scale message passing (MMP) layers, each of which models information exchange among node neighborhoods at various lengthscales. The MMP layer, which augments standard single-scale message passing with learnable coarsening operations, allows the decoder to more efficiently reconstruct the flowfield from the identified regions in the masked fields. Analyses of the latent graphs produced by the autoencoder under various model settings are conducted using unstructured snapshot data sourced from large-eddy simulations of a backward-facing step (BFS) flow configuration, computed with an OpenFOAM-based flow solver at high Reynolds numbers.
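A generic sketch of score-based node reduction in the spirit of the adaptive graph reduction described above (details differ from the paper's layer): nodes are scored, the top-k are retained as "sensor" locations, and their features are gated by the scores to keep the selection differentiable.

```python
import torch

n_nodes, d = 100, 8
x = torch.randn(n_nodes, d)              # node features on the mesh graph
p = torch.randn(d, 1)                    # learnable projection vector (illustrative)

scores = torch.tanh(x @ p / p.norm()).squeeze(-1)
keep = torch.topk(scores, k=10).indices  # retained nodes forming the latent graph
latent = x[keep] * scores[keep, None]    # score-gated features; zeros elsewhere give the masked field

print("retained nodes:", keep.tolist())
print("latent feature shape:", tuple(latent.shape))
```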