Model-Based Reasoning
Exploring the Potential of Hybrid Machine-Learning/Physics-Based Modeling for Atmospheric/Oceanic Prediction Beyond the Medium Range
Patel, Dhruvit, Arcomano, Troy, Hunt, Brian, Szunyogh, Istvan, Ott, Edward
This paper explores the potential of a hybrid modeling approach that combines machine learning (ML) with conventional physics-based modeling for weather prediction beyond the medium range. It extends the work of Arcomano et al. (2022), which tested the approach for short- and medium-range weather prediction, and the work of Arcomano et al. (2023), which investigated its potential for climate modeling. The hybrid model used for the forecast experiments of the paper is based on the low-resolution, simplified-parameterization atmospheric general circulation model (AGCM) SPEEDY. In addition to the hybridized prognostic variables of SPEEDY, the current version of the model has three purely ML-based prognostic variables: 6-hour cumulative precipitation, sea surface temperature, and the heat content of the top 300 m of the ocean. The model has skill in predicting the El Niño cycle and its global teleconnections with precipitation for 3-7 months, depending on the season. It also captures the equatorial variability of precipitation associated with Kelvin waves, Rossby waves, and the Madden-Julian Oscillation (MJO). Predictions of precipitation in the equatorial region have skill for 15 days in the East Pacific and 11.5 days in the West Pacific. Although the model has low spatial resolution, for these tasks its prediction skill is comparable to what has been published for high-resolution, purely physics-based, conventional operational forecast models.
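To make the hybrid idea concrete, here is a minimal sketch of one hybrid forecast step in which a physics-based model and an ML component each propose the next state and a weighting blends them. This illustrates the general approach only; `physics_step`, `ml_step`, and the blending weight `alpha` are hypothetical stand-ins, not the authors' SPEEDY-based implementation.

```python
import numpy as np

def physics_step(state: np.ndarray) -> np.ndarray:
    """Stand-in for one time step of the physics-based AGCM."""
    return state + 0.01 * np.sin(state)          # toy dynamics

def ml_step(state: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Stand-in for the trained ML component."""
    return np.tanh(W @ state)

def hybrid_step(state, W, alpha=0.5):
    # Blend the two predictions of the hybridized prognostic variables.
    return alpha * physics_step(state) + (1.0 - alpha) * ml_step(state, W)

rng = np.random.default_rng(0)
state = rng.standard_normal(8)
W = rng.standard_normal((8, 8)) / 8.0
for _ in range(10):                               # roll the hybrid model forward
    state = hybrid_step(state, W)
```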
Physics-based material parameters extraction from perovskite experiments via Bayesian optimization
Zhan, Hualin, Ahmad, Viqar, Mayon, Azul, Tabi, Grace, Bui, Anh Dinh, Li, Zhuofeng, Walter, Daniel, Nguyen, Hieu, Weber, Klaus, White, Thomas, Catchpole, Kylie
The ability to extract material parameters of perovskites from quantitative experimental analysis is essential for the rational design of photovoltaic and optoelectronic applications. However, the difficulty of this analysis increases significantly with the complexity of the theoretical model and the number of material parameters. Here we use Bayesian optimization to develop an analysis platform that can extract up to 8 fundamental material parameters of an organometallic perovskite semiconductor from a transient photoluminescence experiment, based on a complex full physics model that includes drift-diffusion of carriers and dynamic defect occupation. An example study of thermal degradation reveals that the carrier mobility and trap-assisted recombination coefficient are reduced noticeably, while the defect energy level remains nearly unchanged. The reduced carrier mobility can dominate the overall effect on the thermal degradation of perovskite solar cells by reducing the fill factor, despite the opposite effect of the reduced trap-assisted recombination coefficient, which increases the fill factor. In the future, this platform can be conveniently applied to other experiments or to combinations of experiments, accelerating the discovery and optimization of semiconductor materials for photovoltaics and other applications.
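The following is a minimal sketch of the Bayesian-optimization idea behind such a platform: minimize the mismatch between a physics forward model and a measured transient by fitting a Gaussian-process surrogate to the loss and choosing the next trial point by expected improvement. The single-exponential `forward_model` and the 1-D search over a lifetime `tau` are illustrative assumptions; the paper's model is a full drift-diffusion solver over up to 8 parameters.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

t = np.linspace(0, 1e-6, 200)
def forward_model(tau):                       # toy photoluminescence decay
    return np.exp(-t / tau)

true_tau = 2.3e-7
measured = forward_model(true_tau)            # stand-in for measured data

def loss(tau):                                # experiment/model mismatch
    return np.mean((forward_model(tau) - measured) ** 2)

bounds = (1e-8, 1e-6)
X = list(np.linspace(*bounds, 5))             # initial design points
y = [loss(x) for x in X]
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(25):                           # BO loop: fit GP, pick next point
    gp.fit(np.array(X).reshape(-1, 1), y)
    cand = np.linspace(*bounds, 500).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    imp = min(y) - mu                         # expected improvement (minimizing)
    ei = imp * norm.cdf(imp / (sigma + 1e-12)) + sigma * norm.pdf(imp / (sigma + 1e-12))
    x_next = float(cand[np.argmax(ei)])
    X.append(x_next); y.append(loss(x_next))

print("estimated tau:", X[int(np.argmin(y))])
```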
Generating camera failures as a class of physics-based adversarial examples
Prabhakar, Manav, Girnar, Jwalandhar, Kusari, Arpan
While there has been extensive work recently on generating physics-based adversarial samples, an overlooked class of such samples comes from physical failures in the camera. Camera failures can occur as a result of an external physical process, i.e., breakdown of a component due to stress, or of an internal component failure. In this work, we develop a simulated physical process for generating broken lenses as a class of physics-based adversarial samples. We create a stress-based physical simulation by generating particles constrained in a mesh and applying stress at a random point and at a random angle. We propagate the stress through the mesh, and the end result is a corresponding image that simulates the broken-lens pattern. We also develop a neural emulator that learns the non-linear mapping between the mesh, represented as a graph, and the stress propagation using a constrained propagation setup. We can then statistically compare the generated adversarial samples with real, simulated, and emulated adversarial examples using the detection failure rate of the different classes, and compare between the samples using the Fréchet Inception Distance. Our goal in this work is to provide a robust physics-based process for generating adversarial samples.
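As a rough illustration of the output of such a process, here is a greatly simplified stand-in for the mesh/stress simulation: grow radial cracks from a random impact point at random angles via short jittered segments, then rasterize them as an occlusion mask over an image. The crack counts, jitter, and darkening factor are illustrative assumptions, not the paper's stress-propagation model.

```python
import numpy as np

def broken_lens_mask(h, w, n_cracks=7, steps=60, rng=None):
    rng = rng or np.random.default_rng()
    mask = np.zeros((h, w), dtype=bool)
    cy, cx = rng.integers(0, h), rng.integers(0, w)   # random impact point
    for _ in range(n_cracks):
        ang = rng.uniform(0, 2 * np.pi)               # random crack angle
        y, x = float(cy), float(cx)
        for _ in range(steps):
            ang += rng.normal(0, 0.15)                # jitter the direction
            y += np.sin(ang); x += np.cos(ang)
            iy, ix = int(round(y)), int(round(x))
            if 0 <= iy < h and 0 <= ix < w:
                mask[iy, ix] = True
            else:
                break
    return mask

def apply_breakage(image, mask, darken=0.2):
    out = image.astype(float).copy()
    out[mask] *= darken                               # darken crack pixels
    return out.astype(image.dtype)

img = np.full((128, 128), 200, dtype=np.uint8)        # toy gray image
adv = apply_breakage(img, broken_lens_mask(128, 128))
```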
A finite element-based physics-informed operator learning framework for spatiotemporal partial differential equations on arbitrary domains
Yamazaki, Yusuke, Harandi, Ali, Muramatsu, Mayu, Viardin, Alexandre, Apel, Markus, Brepols, Tim, Reese, Stefanie, Rezaei, Shahed
We propose a novel finite element-based physics-informed operator learning framework that allows for predicting spatiotemporal dynamics governed by partial differential equations (PDEs). The proposed framework employs a loss function inspired by the finite element method (FEM) with an implicit Euler time-integration scheme. A transient thermal conduction problem is considered to benchmark the performance. The proposed operator learning framework takes a temperature field at the current time step as input and predicts the temperature field at the next time step. The Galerkin-discretized weak formulation of the heat equation is employed to incorporate physics into the loss function, an approach we coin finite operator learning (FOL). Upon training, the networks successfully predict the temperature evolution over time for any initial temperature field, with high accuracy compared to the FEM solution. The framework is also confirmed to be applicable to heterogeneous thermal conductivity and arbitrary geometry. The advantages of FOL can be summarized as follows. First, the training is performed in an unsupervised manner, avoiding the need for a large data set prepared from costly simulations or experiments. Instead, random temperature patterns generated by a Gaussian random process and a Fourier series, combined with constant temperature fields, are used as training data to cover the possible temperature cases. Second, shape functions and the backward difference approximation are exploited for the domain discretization, resulting in a purely algebraic equation. This enhances training efficiency, as one avoids time-consuming automatic differentiation when optimizing weights and biases, while accepting possible discretization errors. Finally, thanks to the interpolation power of FEM, any arbitrary geometry can be handled with FOL, which is crucial for addressing various engineering application scenarios.
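A minimal 1-D sketch of the kind of FEM-based loss described here: assemble mass (M) and stiffness (K) matrices for linear elements, then score a candidate next-step temperature field by the implicit-Euler Galerkin residual r = M (T_next - T_now)/dt + K T_next, which can be minimized without simulation data. The element count, conductivity, time step, and omission of boundary conditions are illustrative assumptions.

```python
import numpy as np

n_el, L, k, dt = 20, 1.0, 1.0, 1e-3
h = L / n_el
n_nodes = n_el + 1
M = np.zeros((n_nodes, n_nodes)); K = np.zeros_like(M)
Me = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])   # element mass matrix
Ke = (k / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])   # element stiffness matrix
for e in range(n_el):                                 # standard FEM assembly
    idx = [e, e + 1]
    M[np.ix_(idx, idx)] += Me
    K[np.ix_(idx, idx)] += Ke

def fem_residual_loss(T_now, T_next):
    # implicit-Euler weak-form residual; ~0 when T_next solves the step
    r = M @ (T_next - T_now) / dt + K @ T_next
    return float(r @ r)                               # unsupervised loss value

x = np.linspace(0.0, L, n_nodes)
T_now = np.sin(np.pi * x)                             # an example initial field
T_step = np.linalg.solve(M / dt + K, (M / dt) @ T_now)
print(fem_residual_loss(T_now, T_step))               # near zero for true step
```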
On Probabilistic and Causal Reasoning with Summation Operators
Ibeling, Duligur, Icard, Thomas F., Mossé, Milan
Ibeling et al. (2023) axiomatize increasingly expressive languages of causation and probability, and Mossé et al. (2024) show that reasoning (specifically, the satisfiability problem) in each causal language is as difficult, from a computational-complexity perspective, as reasoning in its merely probabilistic or "correlational" counterpart. Introducing a summation operator to capture common devices that appear in applications -- such as the $do$-calculus of Pearl (2009) for causal inference, which makes ample use of marginalization -- van der Zander et al. (2023) partially extend these earlier complexity results to causal and probabilistic languages with marginalization. We complete this extension, fully characterizing the complexity of probabilistic and causal reasoning with summation, and demonstrating that these again remain equally difficult. Surprisingly, allowing free variables for random variable values results in a system that is undecidable, so long as the ranges of these random variables are unrestricted. Finally, we axiomatize these languages featuring marginalization (or, more generally, summation), resolving open questions posed by Ibeling et al. (2023).
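As standard examples of the marginalization the summation operator is meant to capture (not drawn from the paper's formal syntax), consider summing a joint probability over the range of one variable, and its causal analogue, the backdoor adjustment of the $do$-calculus:

\[
  P(X = x) \;=\; \sum_{y} P(X = x,\, Y = y),
\qquad
  P(Y = y \mid do(X = x)) \;=\; \sum_{z} P(Y = y \mid x, z)\, P(z).
\]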
Using physics-based simulation towards eliminating empiricism in extraterrestrial terramechanics applications
Hu, Wei, Li, Pei, Rogg, Arno, Schepelmann, Alexander, Creager, Colin, Chandler, Samuel, Kamrin, Ken, Negrut, Dan
Recently, there has been a surge of international interest in extraterrestrial exploration targeting the Moon, Mars, the moons of Mars, and various asteroids. This contribution discusses how current state-of-the-art Earth-based testing for designing rovers and landers for these missions leads to overly optimistic conclusions about the behavior of these devices upon deployment on the targeted celestial bodies. The key misconception is that gravitational offset is necessary during the terramechanics testing of rover and lander prototypes on Earth. The body of evidence supporting our argument is tied to a small number of studies conducted during parabolic flights and to insights derived from newly revised scaling laws. We argue that what has prevented the community from fully diagnosing the problem at hand is the absence of effective physics-based models capable of simulating terramechanics under low-gravity conditions. We developed such a physics-based simulator and used it to gauge the mobility of early prototypes of the Volatiles Investigating Polar Exploration Rover (VIPER), which is slated to depart for the Moon in November 2024. This contribution discusses the results generated by this simulator, how they correlate with physical test results from the NASA Glenn SLOPE lab, and the fallacy of the gravitational offset in rover and lander testing. The simulator developed is open-sourced and made publicly available for unfettered use; it can support principled studies that extend beyond trafficability analysis to provide insights into in-situ resource utilization activities, e.g., digging, bulldozing, and berming in low gravity.
Deep Neural Operator Enabled Digital Twin Modeling for Additive Manufacturing
Liu, Ning, Li, Xuxiao, Rajanna, Manoj R., Reutzel, Edward W., Sawyer, Brady, Rao, Prahalada, Lua, Jim, Phan, Nam, Yu, Yue
A digital twin (DT), comprising a physics-based model, a data-driven model, and a machine learning (ML)-enabled efficient surrogate, behaves as a virtual twin of the real-world physical process. In Laser Powder Bed Fusion (L-PBF) based additive manufacturing (AM), a DT can predict the current and future states of the melt pool and the resulting defects corresponding to the input laser parameters, evolve itself by assimilating in-situ sensor data, and optimize the laser parameters to mitigate defect formation. In this paper, we present a deep neural operator enabled computational framework of the DT for closed-loop feedback control of the L-PBF process. This is accomplished by building a high-fidelity computational model to accurately represent the melt pool states, and an efficient surrogate model to approximate the melt pool solution field, followed by a physics-based procedure to extract information from the computed melt pool simulation that can further be correlated to the defect quantities of interest (e.g., surface roughness). In particular, we leverage the data generated from the high-fidelity physics-based model and train a series of Fourier neural operator (FNO) based ML models to effectively learn the relation between the input laser parameters and the corresponding full temperature field of the melt pool. Subsequently, a set of physics-informed variables, such as the melt pool dimensions and the peak temperature, can be extracted to compute the resulting defects. An optimization algorithm is then exercised to control the laser input and minimize defects. The constructed DT can also evolve with its physical twin via offline fine-tuning and online material calibration. Finally, a probabilistic framework is adopted for uncertainty quantification. The developed DT is envisioned to guide the AM process and facilitate high-quality manufacturing.
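Below is a minimal sketch of the core Fourier-layer idea behind an FNO surrogate: transform the field to Fourier space, apply learned complex weights to a few low-frequency modes, and transform back. Channel counts and mode cutoffs are illustrative assumptions (standard FNOs also retain the negative-frequency corner modes); the paper's models map laser parameters to full melt-pool temperature fields.

```python
import torch

class SpectralConv2d(torch.nn.Module):
    def __init__(self, channels: int, modes: int):
        super().__init__()
        scale = 1.0 / (channels * channels)
        self.modes = modes
        self.w = torch.nn.Parameter(
            scale * torch.randn(channels, channels, modes, modes, dtype=torch.cfloat)
        )

    def forward(self, x):                     # x: (batch, channels, H, W)
        x_ft = torch.fft.rfft2(x)             # to Fourier space
        out_ft = torch.zeros_like(x_ft)
        m = self.modes
        # mix channels on the retained low-frequency modes only
        out_ft[:, :, :m, :m] = torch.einsum(
            "bixy,ioxy->boxy", x_ft[:, :, :m, :m], self.w
        )
        return torch.fft.irfft2(out_ft, s=x.shape[-2:])  # back to physical space

layer = SpectralConv2d(channels=4, modes=8)
field = torch.randn(2, 4, 64, 64)             # toy temperature-like fields
print(layer(field).shape)                     # torch.Size([2, 4, 64, 64])
```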
A review on data-driven constitutive laws for solids
Fuhg, Jan Niklas, Padmanabha, Govinda Anantha, Bouklas, Nikolaos, Bahmani, Bahador, Sun, WaiChing, Vlassis, Nikolaos N., Flaschel, Moritz, Carrara, Pietro, De Lorenzis, Laura
This review article highlights state-of-the-art data-driven techniques to discover, encode, surrogate, or emulate constitutive laws that describe the path-independent and path-dependent response of solids. Our objective is to provide an organized taxonomy of the large spectrum of methodologies developed in the past decades and to discuss the benefits and drawbacks of the various techniques for interpreting and forecasting mechanical behavior across different scales. Distinguishing between machine-learning-based and model-free methods, we further categorize approaches based on their interpretability and on their learning process/type of required data, while discussing the key problems of generalization and trustworthiness. We attempt to provide a road map for how these can be reconciled in a data-availability-aware context. We also touch upon relevant aspects such as data sampling techniques, design of experiments, verification, and validation.
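As a minimal sketch of one family of techniques such reviews cover, consider a neural constitutive law that predicts a scalar strain-energy density from strain and obtains stress by automatic differentiation, so the stress is the gradient of a potential by construction. The architecture, layer sizes, and Voigt strain representation are illustrative assumptions, not a method from the review itself.

```python
import torch

class EnergyNet(torch.nn.Module):
    def __init__(self, n_strain=6):           # e.g., Voigt strain components
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n_strain, 32), torch.nn.Softplus(),
            torch.nn.Linear(32, 32), torch.nn.Softplus(),
            torch.nn.Linear(32, 1),
        )

    def forward(self, strain):
        return self.net(strain)               # scalar strain-energy density

def stress_from_energy(model, strain):
    strain = strain.requires_grad_(True)
    energy = model(strain).sum()
    # stress = dW/d(strain), differentiating through the network
    return torch.autograd.grad(energy, strain, create_graph=True)[0]

model = EnergyNet()
eps = 0.01 * torch.randn(8, 6)                # a toy batch of strain states
sigma = stress_from_energy(model, eps)        # (8, 6) predicted stresses
```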
Causal Evaluation of Language Models
Chen, Sirui, Peng, Bo, Chen, Meiqi, Wang, Ruiqi, Xu, Mengying, Zeng, Xingyu, Zhao, Rui, Zhao, Shengjie, Qiao, Yu, Lu, Chaochao
Causal reasoning is viewed as crucial for achieving human-level machine intelligence. Recent advances in language models have expanded the horizons of artificial intelligence across various domains, sparking inquiries into their potential for causal reasoning. In this work, we introduce Causal evaluation of Language Models (CaLM), which, to the best of our knowledge, is the first comprehensive benchmark for evaluating the causal reasoning capabilities of language models. First, we propose the CaLM framework, which establishes a foundational taxonomy consisting of four modules: causal target (i.e., what to evaluate), adaptation (i.e., how to obtain the results), metric (i.e., how to measure the results), and error (i.e., how to analyze the bad results). This taxonomy defines a broad evaluation design space while systematically selecting criteria and priorities. Second, we compose the CaLM dataset, comprising 126,334 data samples, to provide curated sets of causal targets, adaptations, metrics, and errors, offering extensive coverage for diverse research pursuits. Third, we conduct an extensive evaluation of 28 leading language models on a core set of 92 causal targets, 9 adaptations, 7 metrics, and 12 error types. Fourth, we perform detailed analyses of the evaluation results across various dimensions (e.g., adaptation, scale). Fifth, we present 50 high-level empirical findings across 9 dimensions (e.g., model), providing valuable guidance for future language model development. Finally, we develop a multifaceted platform, including a website, leaderboards, datasets, and toolkits, to support scalable and adaptable assessments. We envision CaLM as an ever-evolving benchmark for the community, systematically updated with new causal targets, adaptations, models, metrics, and error types to reflect ongoing research advancements. Project website is at https://opencausalab.github.io/CaLM.
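To make the four-module taxonomy concrete, here is a minimal sketch of how a causal target, an adaptation, a metric, and an error analysis might compose into one evaluation loop. All names here are hypothetical illustrations, not the CaLM toolkit API.

```python
def evaluate(model, dataset, adaptation, metric):
    # each dataset item poses one causal target; the adaptation turns it
    # into a prompt, and the metric scores the model's answer
    records = []
    for item in dataset:
        prompt = adaptation(item)             # e.g., zero-shot or CoT prompting
        answer = model(prompt)
        records.append({"item": item, "answer": answer,
                        "score": metric(answer, item["gold"])})
    return records

def error_breakdown(records):
    # collect failures so bad results can be analyzed by error type
    failures = [r for r in records if r["score"] == 0]
    return {"n_failures": len(failures), "examples": failures[:5]}
```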
Knowledge-guided Machine Learning: Current Trends and Future Prospects
Karpatne, Anuj, Jia, Xiaowei, Kumar, Vipin
This is especially true in environmental sciences, which are rapidly transitioning from being data-poor to data-rich, e.g., with the ever-increasing volumes of environmental data being collected by Earth-observing satellites and in-situ sensors, and those generated by model simulations (e.g., climate model runs [113]). Similar to how recent developments in ML have transformed how we interact with information on the Internet, it is befitting to ask how ML advances can enable Earth system scientists to transform a fundamental goal in science: building better models of physical, biological, and environmental systems. The conventional approach for modeling relationships between input drivers and response variables is to use process-based models rooted in scientific equations. Despite their ability to leverage the mechanistic understanding of scientific phenomena, process-based models suffer from several shortcomings that limit their adoption in complex real-world settings, e.g., imperfections in model formulations (or modeling bias), incorrect choices of parameter values in equations, and the high computational cost of running high-fidelity simulations. In response to these challenges, ML methods offer a promising alternative for capturing statistical relationships between inputs and outputs directly from data. However, "black-box" ML models, which rely solely on the supervision contained in data, show limited generalizability in scientific problems, especially when applied to out-of-distribution data. One reason for this lack of generalizability is the limited scale of data in scientific disciplines, in contrast to mainstream applications of AI and ML, where large-scale datasets in computer vision and natural language modeling have been instrumental in the success of state-of-the-art models. Another fundamental deficiency of black-box ML models is their tendency to produce results that are inconsistent with existing scientific theories, and their inability to provide a mechanistic understanding of patterns and relationships discovered from data, which limits their usefulness in science.
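A minimal sketch of the knowledge-guided remedy this line of work surveys: augment the usual data-fit loss with a penalty for violating a known physical constraint, steering the model toward scientifically consistent predictions even where labels are scarce. The monotonicity constraint (e.g., predicted lake temperature should not increase with depth) and the feature layout are illustrative assumptions.

```python
import torch

def knowledge_guided_loss(model, x, y, x_unlabeled, lam=1.0):
    data_loss = torch.nn.functional.mse_loss(model(x), y)
    # physics penalty: predictions should not increase along the depth axis,
    # evaluated on unlabeled inputs sorted by depth (assumed last feature)
    pred = model(x_unlabeled)
    violation = torch.relu(pred[1:] - pred[:-1])      # positive where violated
    return data_loss + lam * violation.mean()

model = torch.nn.Sequential(torch.nn.Linear(3, 16), torch.nn.Tanh(),
                            torch.nn.Linear(16, 1))
x = torch.randn(32, 3); y = torch.randn(32, 1)        # small labeled set
xu = torch.randn(64, 3)                               # larger unlabeled set
xu = xu[xu[:, -1].argsort()]                          # sort by 'depth' feature
loss = knowledge_guided_loss(model, x, y, xu)
loss.backward()                                       # trainable as usual
```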