Goto

Collaborating Authors

 Model-Based Reasoning


Differentiable Programming for Differential Equations: A Review

arXiv.org Machine Learning

The differentiable programming paradigm is a cornerstone of modern scientific computing. It refers to numerical methods for computing the gradient of a numerical model's output. Many scientific models are based on differential equations, where differentiable programming plays a crucial role in calculating model sensitivities, inverting model parameters, and training hybrid models that combine differential equations with data-driven approaches. Furthermore, recognizing the strong synergies between inverse methods and machine learning offers the opportunity to establish a coherent framework applicable to both fields. Differentiating functions based on the numerical solution of differential equations is non-trivial. Numerous methods based on a wide variety of paradigms have been proposed in the literature, each with pros and cons specific to the type of problem investigated. Here, we provide a comprehensive review of existing techniques to compute derivatives of numerical solutions of differential equations. We first discuss the importance of gradients of solutions of differential equations in a variety of scientific domains. Second, we lay out the mathematical foundations of the various approaches and compare them with each other. Third, we cover the computational considerations and explore the solutions available in modern scientific software. Last but not least, we provide best-practices and recommendations for practitioners. We hope that this work accelerates the fusion of scientific models and data, and fosters a modern approach to scientific modelling.


Path-Specific Causal Reasoning for Fairness-aware Cognitive Diagnosis

arXiv.org Artificial Intelligence

Cognitive Diagnosis~(CD), which leverages students and exercise data to predict students' proficiency levels on different knowledge concepts, is one of fundamental components in Intelligent Education. Due to the scarcity of student-exercise interaction data, most existing methods focus on making the best use of available data, such as exercise content and student information~(e.g., educational context). Despite the great progress, the abuse of student sensitive information has not been paid enough attention. Due to the important position of CD in Intelligent Education, employing sensitive information when making diagnosis predictions will cause serious social issues. Moreover, data-driven neural networks are easily misled by the shortcut between input data and output prediction, exacerbating this problem. Therefore, it is crucial to eliminate the negative impact of sensitive information in CD models. In response, we argue that sensitive attributes of students can also provide useful information, and only the shortcuts directly related to the sensitive information should be eliminated from the diagnosis process. Thus, we employ causal reasoning and design a novel Path-Specific Causal Reasoning Framework (PSCRF) to achieve this goal. Specifically, we first leverage an encoder to extract features and generate embeddings for general information and sensitive information of students. Then, we design a novel attribute-oriented predictor to decouple the sensitive attributes, in which fairness-related sensitive features will be eliminated and other useful information will be retained. Finally, we designed a multi-factor constraint to ensure the performance of fairness and diagnosis performance simultaneously. Extensive experiments over real-world datasets (e.g., PISA dataset) demonstrate the effectiveness of our proposed PSCRF.


PDP: Physics-Based Character Animation via Diffusion Policy

arXiv.org Artificial Intelligence

Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods, as proposed by the robotics community, have demonstrated the ability to capture highly diverse and multi-modal skills. However, naively training a diffusion policy often results in unstable motions for high-frequency, under-actuated control tasks like bipedal locomotion due to rapidly accumulating compounding errors, pushing the agent away from optimal training trajectories. The key idea lies in using RL policies not just for providing optimal trajectories but for providing corrective actions in sub-optimal states, giving the policy a chance to correct for errors caused by environmental stimulus, model errors, or numerical errors in simulation. Our method, Physics-Based Character Animation via Diffusion Policy (PDP), combines reinforcement learning (RL) and behavior cloning (BC) to create a robust diffusion policy for physics-based character animation. We demonstrate PDP on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis.


Physics and geometry informed neural operator network with application to acoustic scattering

arXiv.org Artificial Intelligence

In this paper, we introduce a physics and geometry informed neural operator network with application to the forward simulation of acoustic scattering. The development of geometry informed deep learning models capable of learning a solution operator for different computational domains is a problem of general importance for a variety of engineering applications. To this end, we propose a physics-informed deep operator network (DeepONet) capable of predicting the scattered pressure field for arbitrarily shaped scatterers using a geometric parameterization approach based on non-uniform rational B-splines (NURBS). This approach also results in parsimonious representations of non-trivial scatterer geometries. In contrast to existing physics-based approaches that require model re-evaluation when changing the computational domains, our trained model is capable of learning solution operator that can approximate physically-consistent scattered pressure field in just a few seconds for arbitrary rigid scatterer shapes; it follows that the computational time for forward simulations can improve (i.e. be reduced) by orders of magnitude in comparison to the traditional forward solvers. In addition, this approach can evaluate the scattered pressure field without the need for labeled training data. After presenting the theoretical approach, a comprehensive numerical study is also provided to illustrate the remarkable ability of this approach to simulate the acoustic pressure fields resulting from arbitrary combinations of arbitrary scatterer geometries. These results highlight the unique generalization capability of the proposed operator learning approach.


Exploring the Potential of Hybrid Machine-Learning/Physics-Based Modeling for Atmospheric/Oceanic Prediction Beyond the Medium Range

arXiv.org Artificial Intelligence

This paper explores the potential of a hybrid modeling approach that combines machine learning (ML) with conventional physics-based modeling for weather prediction beyond the medium range. It extends the work of Arcomano et al. (2022), which tested the approach for short- and medium-range weather prediction, and the work of Arcomano et al. (2023), which investigated its potential for climate modeling. The hybrid model used for the forecast experiments of the paper is based on the low-resolution, simplified parameterization atmospheric general circulation model (AGCM) SPEEDY. In addition to the hybridized prognostic variables of SPEEDY, the current version of the model has three purely ML-based prognostic variables. One of these is 6~h cumulative precipitation, another is the sea surface temperature, while the third is the heat content of the top 300 m deep layer of the ocean. The model has skill in predicting the El Ni\~no cycle and its global teleconnections with precipitation for 3-7 months depending on the season. The model captures equatorial variability of the precipitation associated with Kelvin and Rossby waves and MJO. Predictions of the precipitation in the equatorial region have skill for 15 days in the East Pacific and 11.5 days in the West Pacific. Though the model has low spatial resolution, for these tasks it has prediction skill comparable to what has been published for high-resolution, purely physics-based, conventional operational forecast models.


Physics-based material parameters extraction from perovskite experiments via Bayesian optimization

arXiv.org Artificial Intelligence

The ability to extract material parameters of perovskite from quantitative experimental analysis is essential for rational design of photovoltaic and optoelectronic applications. However, the difficulty of this analysis increases significantly with the complexity of the theoretical model and the number of material parameters for perovskite. Here we use Bayesian optimization to develop an analysis platform that can extract up to 8 fundamental material parameters of an organometallic perovskite semiconductor from a transient photoluminescence experiment, based on a complex full physics model that includes drift-diffusion of carriers and dynamic defect occupation. An example study of thermal degradation reveals that the carrier mobility and trap-assisted recombination coefficient are reduced noticeably, while the defect energy level remains nearly unchanged. The reduced carrier mobility can dominate the overall effect on thermal degradation of perovskite solar cells by reducing the fill factor, despite the opposite effect of the reduced trap-assisted recombination coefficient on increasing the fill factor. In future, this platform can be conveniently applied to other experiments or to combinations of experiments, accelerating materials discovery and optimization of semiconductor materials for photovoltaics and other applications.


Generating camera failures as a class of physics-based adversarial examples

arXiv.org Artificial Intelligence

While there has been extensive work on generating physics-based adversarial samples recently, an overlooked class of such samples come from physical failures in the camera. Camera failures can occur as a result of an external physical process, i.e. breakdown of a component due to stress, or an internal component failure. In this work, we develop a simulated physical process for generating broken lens as a class of physics-based adversarial samples. We create a stress-based physical simulation by generating particles constrained in a mesh and apply stress at a random point and at a random angle. We perform stress propagation through the mesh and the end result of the mesh is a corresponding image which simulates the broken lens pattern. We also develop a neural emulator which learns the non-linear mapping between the mesh as a graph and the stress propagation using constrained propagation setup. We can then statistically compare the difference between the generated adversarial samples with real, simulated and emulated adversarial examples using the detection failure rate of the different classes and in between the samples using the Frechet Inception distance. Our goal through this work is to provide a robust physics based process for generating adversarial samples.


A finite element-based physics-informed operator learning framework for spatiotemporal partial differential equations on arbitrary domains

arXiv.org Artificial Intelligence

We propose a novel finite element-based physics-informed operator learning framework that allows for predicting spatiotemporal dynamics governed by partial differential equations (PDEs). The proposed framework employs a loss function inspired by the finite element method (FEM) with the implicit Euler time integration scheme. A transient thermal conduction problem is considered to benchmark the performance. The proposed operator learning framework takes a temperature field at the current time step as input and predicts a temperature field at the next time step. The Galerkin discretized weak formulation of the heat equation is employed to incorporate physics into the loss function, which is coined finite operator learning (FOL). Upon training, the networks successfully predict the temperature evolution over time for any initial temperature field at high accuracy compared to the FEM solution. The framework is also confirmed to be applicable to a heterogeneous thermal conductivity and arbitrary geometry. The advantages of FOL can be summarized as follows: First, the training is performed in an unsupervised manner, avoiding the need for a large data set prepared from costly simulations or experiments. Instead, random temperature patterns generated by the Gaussian random process and the Fourier series, combined with constant temperature fields, are used as training data to cover possible temperature cases. Second, shape functions and backward difference approximation are exploited for the domain discretization, resulting in a purely algebraic equation. This enhances training efficiency, as one avoids time-consuming automatic differentiation when optimizing weights and biases while accepting possible discretization errors. Finally, thanks to the interpolation power of FEM, any arbitrary geometry can be handled with FOL, which is crucial to addressing various engineering application scenarios.


On Probabilistic and Causal Reasoning with Summation Operators

arXiv.org Artificial Intelligence

Ibeling et al. (2023). axiomatize increasingly expressive languages of causation and probability, and Mosse et al. (2024) show that reasoning (specifically the satisfiability problem) in each causal language is as difficult, from a computational complexity perspective, as reasoning in its merely probabilistic or "correlational" counterpart. Introducing a summation operator to capture common devices that appear in applications -- such as the $do$-calculus of Pearl (2009) for causal inference, which makes ample use of marginalization -- van der Zander et al. (2023) partially extend these earlier complexity results to causal and probabilistic languages with marginalization. We complete this extension, fully characterizing the complexity of probabilistic and causal reasoning with summation, demonstrating that these again remain equally difficult. Surprisingly, allowing free variables for random variable values results in a system that is undecidable, so long as the ranges of these random variables are unrestricted. We finally axiomatize these languages featuring marginalization (or more generally summation), resolving open questions posed by Ibeling et al. (2023).


Deep Neural Operator Enabled Digital Twin Modeling for Additive Manufacturing

arXiv.org Artificial Intelligence

A digital twin (DT), with the components of a physics-based model, a data-driven model, and a machine learning (ML) enabled efficient surrogate, behaves as a virtual twin of the real-world physical process. In terms of Laser Powder Bed Fusion (L-PBF) based additive manufacturing (AM), a DT can predict the current and future states of the melt pool and the resulting defects corresponding to the input laser parameters, evolve itself by assimilating in-situ sensor data, and optimize the laser parameters to mitigate defect formation. In this paper, we present a deep neural operator enabled computational framework of the DT for closed-loop feedback control of the L-PBF process. This is accomplished by building a high-fidelity computational model to accurately represent the melt pool states, an efficient surrogate model to approximate the melt pool solution field, followed by an physics-based procedure to extract information from the computed melt pool simulation that can further be correlated to the defect quantities of interest (e.g., surface roughness). In particular, we leverage the data generated from the high-fidelity physics-based model and train a series of Fourier neural operator (FNO) based ML models to effectively learn the relation between the input laser parameters and the corresponding full temperature field of the melt pool. Subsequently, a set of physics-informed variables such as the melt pool dimensions and the peak temperature can be extracted to compute the resulting defects. An optimization algorithm is then exercised to control laser input and minimize defects. On the other hand, the constructed DT can also evolve with the physical twin via offline finetuning and online material calibration. Finally, a probabilistic framework is adopted for uncertainty quantification. The developed DT is envisioned to guide the AM process and facilitate high-quality manufacturing.