Tartakovsky, Alexandre
Mathematics of Digital Twins and Transfer Learning for PDE Models
Zong, Yifei, Tartakovsky, Alexandre
We define a digital twin (DT) of a physical system governed by partial differential equations (PDEs) as a model for real-time simulations and control of the system behavior under changing conditions. We construct DTs using the Karhunen-Loève Neural Network (KL-NN) surrogate model and transfer learning (TL). The surrogate model allows fast inference and differentiability with respect to control parameters for control and optimization. TL is used to retrain the model for new conditions with minimal additional data. We employ the moment equations to analyze TL and identify parameters that can be transferred to new conditions. The proposed analysis also guides the selection of control variables in the DT to facilitate efficient TL. For linear PDE problems, the non-transferable parameters in the KL-NN surrogate model can be exactly estimated from a single solution of the PDE corresponding to the mean values of the control variables under new target conditions. Retraining an ML model with a single solution sample is known as one-shot learning, and our analysis shows that one-shot TL is exact for linear PDEs. For nonlinear PDE problems, transferring any parameters introduces errors. For a nonlinear diffusion PDE model, we find that for a relatively small range of control variables, some surrogate model parameters can be transferred without introducing a significant error, some can be approximately estimated from the mean-field equation, and the rest can be found using a linear residual least-squares problem or an ordinary linear least-squares problem if a small labeled dataset for new conditions is available. The former approach results in one-shot TL, while the latter is an example of few-shot TL. Both methods are approximate for nonlinear PDEs.
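The exactness of one-shot TL in the linear case can be illustrated with a toy problem. The sketch below is not the authors' KL-NN implementation. It assumes a hypothetical linear "solver" u = M k + c in place of a PDE, where the offset c stands for the conditions that change between source and target, and it uses a linear least-squares map in place of the neural network. All names (M, c_source, c_target, solve, surrogate) are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_ctrl, n_train = 30, 4, 200

# Hypothetical linear "solver": u = M k + c. The matrix M is shared by all
# conditions; the offset c (e.g. boundary or source terms) defines a condition.
M = rng.normal(size=(n_state, n_ctrl))
c_source = rng.normal(size=n_state)
c_target = rng.normal(size=n_state)

def solve(k, c):
    """Stand-in for one PDE solve with control vector k under condition c."""
    return M @ k + c

# Training data under the source conditions only.
K = 1.0 + 0.3 * rng.normal(size=(n_train, n_ctrl))   # control samples
k_bar = K.mean(axis=0)                               # mean control vector
U = K @ M.T + c_source                               # row i is solve(K[i], c_source)
u_mean_source = solve(k_bar, c_source)               # mean state (linear problem)

# KL-type decomposition of the state fluctuations via the SVD.
_, _, Vt = np.linalg.svd(U - u_mean_source, full_matrices=False)
modes = Vt[:n_ctrl]                                  # fluctuations have rank <= n_ctrl
eta = (U - u_mean_source) @ modes.T                  # state coefficients

# Linear map from centered controls to state coefficients
# (an assumption: it replaces the neural network of the KL-NN model).
W, *_ = np.linalg.lstsq(K - k_bar, eta, rcond=None)

def surrogate(k, u_mean):
    """Surrogate prediction: mean state plus mapped fluctuation."""
    return u_mean + ((k - k_bar) @ W) @ modes

# One-shot transfer: re-estimate only the mean state from a single solve at
# the mean control values under the target conditions; keep modes and W.
u_mean_target = solve(k_bar, c_target)

k_new = 1.0 + 0.3 * rng.normal(size=n_ctrl)
u_true = solve(k_new, c_target)
err_transfer = np.max(np.abs(surrogate(k_new, u_mean_target) - u_true))
err_no_transfer = np.max(np.abs(surrogate(k_new, u_mean_source) - u_true))
print(err_transfer, err_no_transfer)
```

In this toy setting the fluctuation u - mean(u) does not depend on c, which is why the modes and the map W carry over unchanged and only the mean state has to be recomputed. For a nonlinear solver that separation no longer holds, in line with the abstract's statement that transfer is then approximate.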
Differentiable modeling to unify machine learning and physical models and advance Geosciences
Shen, Chaopeng, Appling, Alison P., Gentine, Pierre, Bandai, Toshiyuki, Gupta, Hoshin, Tartakovsky, Alexandre, Baity-Jesi, Marco, Fenicia, Fabrizio, Kifer, Daniel, Li, Li, Liu, Xiaofeng, Ren, Wei, Zheng, Yi, Harman, Ciaran J., Clark, Martyn, Farthing, Matthew, Feng, Dapeng, Kumar, Praveen, Aboelyazeed, Doaa, Rahmani, Farshid, Beck, Hylke E., Bindas, Tadd, Dwivedi, Dipankar, Fang, Kuai, Höge, Marvin, Rackauckas, Chris, Roy, Tirthankar, Xu, Chonggang, Mohanty, Binayak, Lawson, Kathryn
Process-Based Modeling (PBM) and Machine Learning (ML) are often perceived as distinct paradigms in the geosciences. Here we present differentiable geoscientific modeling as a powerful pathway toward dissolving the perceived barrier between them and ushering in a paradigm shift. For decades, PBM offered benefits in interpretability and physical consistency but struggled to efficiently leverage large datasets. ML methods, especially deep networks, presented strong predictive skills yet lacked the ability to answer specific scientific questions. While various methods have been proposed for ML-physics integration, an important underlying theme -- differentiable modeling -- is not sufficiently recognized. Here we outline the concepts, applicability, and significance of differentiable geoscientific modeling (DG). "Differentiable" refers to accurately and efficiently calculating gradients with respect to model variables, critically enabling the learning of high-dimensional unknown relationships. DG refers to a range of methods connecting varying amounts of prior knowledge to neural networks and training them together, capturing a different scope than physics-guided machine learning and emphasizing first principles. Preliminary evidence suggests DG offers better interpretability and causality than ML, improved generalizability and extrapolation capability, and strong potential for knowledge discovery, while approaching the performance of purely data-driven ML. DG models require less training data while scaling favorably in performance and efficiency with increasing amounts of data. With DG, geoscientists may be better able to frame and investigate questions, test hypotheses, and discover unrecognized linkages.
Physics-Informed Kriging: A Physics-Informed Gaussian Process Regression Method for Data-Model Convergence
Yang, Xiu, Tartakovsky, Guzel, Tartakovsky, Alexandre
In this work, we propose a new Gaussian process regression (GPR) method: physics-informed Kriging (PhIK). In standard data-driven Kriging, the unknown function of interest is usually treated as a Gaussian process with an assumed stationary covariance whose hyperparameters are estimated from data. In PhIK, we compute the mean and covariance function from realizations of available stochastic models, e.g., from realizations of solutions of the governing stochastic partial differential equations. The Gaussian process constructed this way is generally non-stationary and does not assume a specific form of the covariance function. Our approach avoids the costly optimization step that data-driven GPR methods need to identify the hyperparameters. More importantly, we prove that physical constraints in the form of a deterministic linear operator are guaranteed to hold in the resulting prediction. We also provide an error estimate for the preservation of the physical constraints when the stochastic model realizations contain errors. To reduce the computational cost of obtaining stochastic model realizations, we propose a multilevel Monte Carlo estimate of the mean and covariance functions. Further, we present an active learning algorithm that guides the selection of additional observation locations. The efficiency and accuracy of PhIK are demonstrated for reconstructing a partially known modified Branin function and learning a conservative tracer distribution from sparse concentration measurements.
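The core idea of building the Kriging mean and covariance from model realizations can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' code. The "stochastic model" is a hypothetical random sine series added to the profile y = x, so every realization satisfies the boundary values y(0) = 0 and y(1) = 1. The observation indices, the ensemble size, and the small diagonal jitter are arbitrary choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n_grid, n_real, n_modes = 51, 500, 6
x = np.linspace(0.0, 1.0, n_grid)
j = np.arange(1, n_modes + 1)
basis = np.sin(np.pi * np.outer(j, x))     # shape (n_modes, n_grid)

def sample_fields(n):
    """Hypothetical stochastic model: y(x) = x + sum_j a_j sin(j*pi*x)."""
    a = rng.normal(size=(n, n_modes)) / j  # a_j ~ N(0, 1/j^2)
    return x + a @ basis

# Ensemble statistics replace an assumed stationary kernel.
Y = sample_fields(n_real)                  # shape (n_real, n_grid)
mu = Y.mean(axis=0)                        # ensemble mean
cov = np.cov(Y, rowvar=False)              # ensemble covariance, (n_grid, n_grid)

# Sparse observations of an unseen "true" field from the same model.
y_true = sample_fields(1)[0]
obs = np.array([7, 18, 29, 43])            # observed grid indices (arbitrary)
y_obs = y_true[obs]

# Kriging prediction: mu + cov(x, X_obs) C^{-1} (y_obs - mu(X_obs)).
C = cov[np.ix_(obs, obs)] + 1e-12 * np.eye(obs.size)  # tiny jitter for stability
weights = np.linalg.solve(C, y_obs - mu[obs])
pred = mu + cov[:, obs] @ weights

print("boundary values of prediction:", pred[0], pred[-1])
print("max misfit at observations:", np.max(np.abs(pred[obs] - y_obs)))
```

No hyperparameter optimization appears anywhere in the sketch: the covariance comes directly from the ensemble. Because every realization shares the same boundary values, the ensemble covariance vanishes at the boundaries (up to rounding) and the prediction inherits those values, which is a simple instance of the linear-constraint property stated in the abstract.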