
Collaborating Authors: Bilionis, Ilias


An interpretation of the Brownian bridge as a physics-informed prior for the Poisson equation

arXiv.org Machine Learning

Physics-informed machine learning is one of the most commonly used methods for fusing physical knowledge in the form of partial differential equations with experimental data. The idea is to construct a loss function where the physical laws take the place of a regularizer and minimize it to reconstruct the underlying physical fields and any missing parameters. However, there is a noticeable lack of a direct connection between physics-informed loss functions and an overarching Bayesian framework. In this work, we demonstrate that Brownian bridge Gaussian processes can be viewed as a softly-enforced physics-constrained prior for the Poisson equation. We first show equivalence between the variational form of the physics-informed loss function for the Poisson equation and a kernel ridge regression objective. Then, through the connection between Gaussian process regression and kernel methods, we identify a Gaussian process for which the posterior mean function and physics-informed loss function minimizer agree. This connection allows us to probe different theoretical questions, such as convergence and behavior of inverse problems. We also connect the method to the important problem of identifying model-form error in applications.
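
As a minimal illustration of this connection, the sketch below uses the classical Brownian bridge covariance k(s, t) = min(s, t) - st on [0, 1], whose sample paths vanish at both endpoints and thus encode the homogeneous Dirichlet boundary conditions of a 1D Poisson problem, as a Gaussian process prior, and computes the standard GP posterior mean from noisy point observations. The forcing, noise level, and data are hypothetical, and the paper's variational-form equivalence is not re-derived here.

```python
import numpy as np

# Brownian bridge covariance on [0, 1]: k(s, t) = min(s, t) - s * t.
# Its paths vanish at both endpoints, matching homogeneous Dirichlet
# boundary conditions for the 1D Poisson problem -u'' = f, u(0) = u(1) = 0.
def bb_kernel(s, t):
    return np.minimum.outer(s, t) - np.outer(s, t)

rng = np.random.default_rng(0)

# Hypothetical ground truth: f(x) = pi^2 sin(pi x) gives u(x) = sin(pi x).
u_true = lambda x: np.sin(np.pi * x)

X = rng.uniform(0.0, 1.0, 15)             # measurement locations
sigma = 0.05                               # noise standard deviation
y = u_true(X) + sigma * rng.normal(size=X.size)

# Standard GP regression posterior mean under the Brownian bridge prior.
K = bb_kernel(X, X) + sigma**2 * np.eye(X.size)
alpha = np.linalg.solve(K, y)

x_star = np.linspace(0.0, 1.0, 101)
mean = bb_kernel(x_star, X) @ alpha

print(np.max(np.abs(mean - u_true(x_star))))  # reconstruction error
```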


A Causal Graph-Enhanced Gaussian Process Regression for Modeling Engine-out NOx

arXiv.org Artificial Intelligence

The stringent regulatory requirements on nitrogen oxides (NOx) emissions from diesel compression ignition engines require accurate and reliable models for real-time monitoring and diagnostics. Although traditional methods such as physical sensors and virtual engine control module (ECM) sensors provide essential data, they yield only deterministic point estimates. The existing literature likewise focuses primarily on deterministic models, with little emphasis on capturing the uncertainties due to sensors. The lack of probabilistic frameworks restricts the applicability of these models for robust diagnostics. The objective of this paper is to develop and validate a probabilistic model to predict engine-out NOx emissions using Gaussian process regression. Our approach is as follows. We employ three variants of Gaussian process models: the first with a standard radial basis function kernel over an input window, the second incorporating a deep kernel that uses convolutional neural networks to capture temporal dependencies, and the third enriching the deep kernel with a causal graph derived via graph convolutional networks. The causal graph embeds physics knowledge into the learning process. All models are compared against a virtual ECM sensor using both quantitative and qualitative metrics. We conclude that our model provides an improvement in predictive performance when using an input window and a deep kernel structure. Even more compelling is the further enhancement achieved by incorporating a causal graph into the deep kernel. These findings are corroborated across different validation datasets.
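
As a rough sketch of the first model variant (a standard RBF kernel over a sliding input window), the scikit-learn snippet below fits a probabilistic GP on hypothetical stand-in signals. The channel definitions, window length, and data are illustrative assumptions, and the deep-kernel and causal-graph variants are not shown.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

# Hypothetical stand-in for engine channels: a multivariate input series
# (e.g., fuel rate, speed, ...) and a scalar NOx-like target.
T, d, window = 500, 3, 5
U = rng.normal(size=(T, d))
nox = np.convolve(U[:, 0] ** 2 + 0.5 * U[:, 1], np.ones(5) / 5, mode="same")
nox += 0.05 * rng.normal(size=T)

# RBF kernel over a sliding input window: each training point stacks the
# last `window` samples of all channels.
Xw = np.stack([U[t - window:t].ravel() for t in range(window, T)])
yw = nox[window:]

kernel = RBF(length_scale=np.ones(Xw.shape[1])) + WhiteKernel(1e-2)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(Xw[:400], yw[:400])

# Probabilistic predictions: mean and standard deviation per test point.
mu, std = gp.predict(Xw[400:], return_std=True)
print(mu[:3], std[:3])
```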


Generative Hyperelasticity with Physics-Informed Probabilistic Diffusion Fields

arXiv.org Artificial Intelligence

Many natural materials exhibit highly complex, nonlinear, anisotropic, and heterogeneous mechanical properties. Recently, it has been demonstrated that data-driven strain energy functions possess the flexibility to capture the behavior of these complex materials with high accuracy while satisfying physics-based constraints. However, most of these approaches disregard the uncertainty in the estimates and the spatial heterogeneity of these materials. In this work, we leverage recent advances in generative models to address these issues. We use as building blocks neural ordinary differential equations (NODEs) that, by construction, create polyconvex strain energy functions, a key property of realistic hyperelastic material models. We combine this approach with probabilistic diffusion models to generate new samples of strain energy functions. This technique allows us to sample a vector of Gaussian white noise and translate it to NODE parameters, thereby representing plausible strain energy functions. We extend our approach to spatially correlated diffusion, resulting in heterogeneous material properties for arbitrary geometries. We extensively test our method with synthetic and experimental data on biological tissues and run finite element simulations with various degrees of spatial heterogeneity. We believe this approach is a major step toward including uncertainty in predictive, data-driven models of hyperelasticity.
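
A compact sketch of the generative step the abstract describes: Gaussian white noise is transformed by a learned reverse diffusion into a parameter vector, e.g., for a NODE-based strain energy function. The noise schedule and the trained noise-prediction network `eps_model` are assumptions, and the polyconvexity construction and spatially correlated extension are omitted.

```python
import torch

# Minimal DDPM-style reverse sampler (a sketch, not the paper's exact model).
# `eps_model` is assumed to be a trained network that predicts the noise added
# to a flattened vector of strain-energy (e.g., NODE) parameters.
def sample_parameters(eps_model, dim, steps=1000, device="cpu"):
    betas = torch.linspace(1e-4, 2e-2, steps, device=device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(dim, device=device)      # start from white noise
    for t in reversed(range(steps)):
        eps = eps_model(x, torch.tensor([t], device=device))
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x  # a plausible parameter vector for a strain energy function
```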


An information field theory approach to Bayesian state and parameter estimation in dynamical systems

arXiv.org Artificial Intelligence

Dynamical system state estimation and parameter calibration problems are ubiquitous across science and engineering. Bayesian approaches to the problem are the gold standard as they allow for the quantification of uncertainties and enable the seamless fusion of different experimental modalities. When the dynamics are discrete and stochastic, one may employ powerful techniques such as Kalman, particle, or variational filters. Practitioners commonly apply these methods to continuous-time, deterministic dynamical systems after discretizing the dynamics and introducing fictitious transition probabilities. However, approaches based on time-discretization suffer from the curse of dimensionality since the number of random variables grows linearly with the number of time-steps. Furthermore, the introduction of fictitious transition probabilities is an unsatisfactory solution because it increases the number of model parameters and may lead to inference bias. To address these drawbacks, the objective of this paper is to develop a scalable Bayesian approach to state and parameter estimation suitable for continuous-time, deterministic dynamical systems. Our methodology builds upon information field theory. Specifically, we construct a physics-informed prior probability measure on the function space of system responses so that functions that satisfy the physics are more likely. This prior allows us to quantify model form errors. We connect the system's response to observations through a probabilistic model of the measurement process. The joint posterior over the system responses and all parameters is given by Bayes' rule. To approximate the intractable posterior, we develop a stochastic variational inference algorithm. In summary, the developed methodology offers a powerful framework for Bayesian estimation in dynamical systems.
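
To make the physics-informed prior concrete, here is a sketch that scores a candidate trajectory by its squared physics residual, so that functions nearly satisfying the dynamics receive higher prior density. Note that the paper constructs this prior on function space without committing to a discretization; this illustration uses a simple forward-difference grid, and the dynamics, beta, and trajectory are hypothetical.

```python
import torch

# Sketch of a physics-informed prior on a discretized trajectory:
# log p(u) ~ -beta * sum_t (du/dt - f(u_t, theta))^2 * dt  (up to a constant).
def physics_log_prior(u, f, theta, dt, beta=100.0):
    dudt = (u[1:] - u[:-1]) / dt           # forward-difference derivative
    resid = dudt - f(u[:-1], theta)        # physics residual at each step
    return -beta * dt * torch.sum(resid ** 2)

# Hypothetical example: logistic growth du/dt = theta * u * (1 - u).
f = lambda u, theta: theta * u * (1.0 - u)
u = torch.linspace(0.1, 0.9, 100, requires_grad=True)
logp = physics_log_prior(u, f, torch.tensor(1.5), dt=0.05)
logp.backward()                             # gradients for variational inference
print(logp.item(), u.grad.norm().item())
```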


Learning to solve Bayesian inverse problems: An amortized variational inference approach

arXiv.org Artificial Intelligence

Inverse problems, i.e., estimating parameters of physical models from experimental data, are ubiquitous in science and engineering. The Bayesian formulation is the gold standard because it alleviates ill-posedness issues and quantifies epistemic uncertainty. Since analytical posteriors are not typically available, one resorts to Markov chain Monte Carlo sampling or approximate variational inference. However, inference needs to be rerun from scratch for each new set of data. This drawback limits the applicability of the Bayesian formulation to real-time settings, e.g., health monitoring of engineered systems and medical diagnosis. The objective of this paper is to develop a methodology that enables real-time inference by learning the Bayesian inverse map, i.e., the map from data to posteriors. Our approach is as follows. We represent the posterior distribution using a parameterization based on deep neural networks. Next, we learn the network parameters by an amortized variational inference method, which involves maximizing the expectation of the evidence lower bound over all possible datasets compatible with the model. We demonstrate our approach by solving a set of benchmark problems from science and engineering. Our results show that the posterior estimates of our approach are in agreement with the corresponding ground truth obtained by Markov chain Monte Carlo. Once trained, our approach provides the posterior parameters for a given observation at just the cost of a forward pass of the neural network.
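
A minimal sketch of the amortization idea: a network maps a dataset y straight to the parameters of a Gaussian posterior over theta, trained by maximizing the expected ELBO over datasets simulated from the prior and a forward model. The forward model, dimensions, and Gaussian posterior family below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Amortization network: data y -> (mean, log-std) of a Gaussian posterior.
theta_dim, y_dim = 2, 10
net = nn.Sequential(nn.Linear(y_dim, 64), nn.Tanh(), nn.Linear(64, 2 * theta_dim))

def forward_model(theta):                  # hypothetical physical model
    t = torch.linspace(0, 1, y_dim)
    return theta[..., :1] * torch.sin(3 * t) + theta[..., 1:] * t

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
sigma = 0.1                                 # observation noise std

for step in range(2000):
    theta = torch.randn(128, theta_dim)     # sample from the prior N(0, I)
    y = forward_model(theta) + sigma * torch.randn(128, y_dim)

    mu, log_std = net(y).chunk(2, dim=-1)
    q_theta = mu + log_std.exp() * torch.randn_like(mu)   # reparameterization

    log_like = -0.5 * ((y - forward_model(q_theta)) / sigma).pow(2).sum(-1)
    log_prior = -0.5 * q_theta.pow(2).sum(-1)
    entropy = log_std.sum(-1)               # Gaussian entropy up to a constant
    loss = -(log_like + log_prior + entropy).mean()  # negative expected ELBO

    opt.zero_grad(); loss.backward(); opt.step()
# After training, net(y) returns posterior parameters in one forward pass.
```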


Physics-informed Information Field Theory for Modeling Physical Systems with Uncertainty Quantification

arXiv.org Artificial Intelligence

Data-driven approaches coupled with physical knowledge are powerful techniques to model systems. The goal of such models is to efficiently solve for the underlying field by combining measurements with known physical laws. As many systems contain unknown elements, such as missing parameters, noisy data, or incomplete physical laws, this is widely approached as an uncertainty quantification problem. The common techniques to handle all the variables typically depend on the numerical scheme used to approximate the posterior, and it is desirable to have a method which is independent of any such discretization. Information field theory (IFT) provides the tools necessary to perform statistics over fields that are not necessarily Gaussian. We extend IFT to physics-informed IFT (PIFT) by encoding the functional priors with information about the physical laws which describe the field. The posteriors derived from this PIFT remain independent of any numerical scheme and can capture multiple modes, allowing for the solution of problems which are ill-posed. We demonstrate our approach through an analytical example involving the Klein-Gordon equation. We then develop a variant of stochastic gradient Langevin dynamics to draw samples from the joint posterior over the field and model parameters. We apply our method to numerical examples with various degrees of model-form error and to inverse problems involving nonlinear differential equations. As an addendum, the method is equipped with a metric which allows the posterior to automatically quantify model-form uncertainty. Because of this, our numerical experiments show that the method remains robust to even an incorrect representation of the physics given sufficient data. We numerically demonstrate that the method correctly identifies when the physics cannot be trusted, in which case it automatically treats learning the field as a regression problem.
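
The sampler the abstract mentions can be sketched generically as follows. This is plain stochastic gradient Langevin dynamics on an assumed differentiable unnormalized log-posterior, demonstrated on a toy bimodal target standing in for the multi-modal joint posterior over field and model parameters.

```python
import torch

# Stochastic gradient Langevin dynamics (SGLD):
# z <- z + (step / 2) * grad log p(z) + N(0, step).
def sgld(log_post, z0, step=1e-4, n_steps=5000):
    z = z0.clone().requires_grad_(True)
    samples = []
    for _ in range(n_steps):
        logp = log_post(z)
        grad, = torch.autograd.grad(logp, z)
        with torch.no_grad():
            z += 0.5 * step * grad + torch.sqrt(torch.tensor(step)) * torch.randn_like(z)
        samples.append(z.detach().clone())
    return torch.stack(samples)

# Usage on a toy bimodal log-posterior with modes at -2 and +2.
log_post = lambda z: torch.log(torch.exp(-0.5 * (z - 2).pow(2).sum())
                               + torch.exp(-0.5 * (z + 2).pow(2).sum()))
draws = sgld(log_post, torch.zeros(1))
print(draws.mean(), draws.std())
```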


Bayesian Model Averaging for Data Driven Decision Making when Causality is Partially Known

arXiv.org Artificial Intelligence

Probabilistic machine learning models are often insufficient to help with decisions on interventions because those models find correlations, not causal relationships. If only observational data are available and experimentation is infeasible, the correct approach to study the impact of an intervention is to invoke Pearl's causality framework. Even that framework assumes that the underlying causal graph is known, which is seldom the case in practice. When the causal structure is not known, one may use out-of-the-box algorithms to find causal dependencies from observational data. However, no existing method also accounts for the decision-maker's prior knowledge when developing the causal structure. The objective of this paper is to develop rational approaches for making decisions from observational data in the presence of causal graph uncertainty and prior knowledge from the decision-maker. We use ensemble methods like Bayesian Model Averaging (BMA) to infer a set of causal graphs that can represent the data generation process. We provide decisions by explicitly computing the expected value and risk of potential interventions. We demonstrate our approach by applying it in different example contexts.
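
A toy sketch of the BMA pipeline on two variables: score the candidate graphs X -> Y and Y -> X with BIC, convert scores to graph weights, and average the interventional prediction E[Y | do(X = x)]. The data, scoring, and uniform graph prior are illustrative assumptions; notably, the two graphs are Markov equivalent for linear-Gaussian data, so the weights stay near 1/2 and the average genuinely hedges, while a decision-maker's knowledge would enter through a non-uniform p(G).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical observational data generated by X -> Y (unknown to the analyst).
n = 500
X = rng.normal(size=n)
Y = 1.5 * X + rng.normal(scale=0.5, size=n)

def gaussian_bic(resid, k):
    # BIC of a linear-Gaussian model piece with k parameters, from residuals.
    return n * np.log(resid.var()) + k * np.log(n)

# Score the two candidate graphs G1: X -> Y and G2: Y -> X.
b1 = X @ Y / (X @ X)                       # regression coefficient Y ~ X
bic1 = gaussian_bic(X - X.mean(), 2) + gaussian_bic(Y - b1 * X, 2)
b2 = X @ Y / (Y @ Y)                       # regression coefficient X ~ Y
bic2 = gaussian_bic(Y - Y.mean(), 2) + gaussian_bic(X - b2 * Y, 2)

# Graph weights p(G | D) ~ exp(-BIC / 2) * p(G), with a uniform p(G) here.
w = np.exp(-0.5 * (np.array([bic1, bic2]) - min(bic1, bic2)))
w /= w.sum()

# BMA estimate of E[Y | do(X = x)]: b1 * x under G1, while under G2
# intervening on X leaves Y at its observational mean.
x = 2.0
print(w, w[0] * b1 * x + w[1] * Y.mean())
```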


Towards a Theory of Systems Engineering Processes: A Principal-Agent Model of a One-Shot, Shallow Process

arXiv.org Artificial Intelligence

Systems engineering processes coordinate the effort of different individuals to generate a product satisfying certain requirements. As the involved engineers are self-interested agents, the goals at different levels of the systems engineering hierarchy may deviate from the system-level goals, which may cause budget and schedule overruns. Therefore, there is a need for a systems engineering theory that accounts for human behavior in systems design. To this end, the objective of this paper is to develop and analyze a principal-agent model of a one-shot (single iteration), shallow (one level of hierarchy) systems engineering process. We assume that the systems engineer maximizes the expected utility of the system, while the subsystem engineers seek to maximize their expected utilities. Furthermore, the systems engineer is unable to monitor the effort of the subsystem engineers and may not have complete information about their types or the complexity of the design task. However, the systems engineer can incentivize the subsystem engineers by proposing specific contracts. To obtain an optimal incentive, we pose and solve numerically a bi-level optimization problem. Through extensive simulations, we study the optimal incentives arising from different system-level value functions under various combinations of effort costs, problem-solving skills, and task complexities.
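
The bi-level structure can be sketched with a grid search: for each candidate incentive, solve the subsystem engineer's best-response (inner) problem, then pick the incentive maximizing the systems engineer's payoff (outer problem). The quality, cost, and value functions below are illustrative assumptions, and the paper's hidden effort and incomplete information are not modeled.

```python
import numpy as np

# Inner problem: the agent (subsystem engineer) chooses effort to maximize
# payment minus effort cost. Outer problem: the principal (systems engineer)
# chooses the incentive `a` to maximize system value minus the payment.
quality = lambda e: 1.0 - np.exp(-2.0 * e)       # design quality from effort
cost = lambda e: 0.5 * e ** 2                    # agent's cost of effort

efforts = np.linspace(0.0, 3.0, 301)

def agent_best_effort(a):
    utilities = a * quality(efforts) - cost(efforts)
    return efforts[np.argmax(utilities)]

incentives = np.linspace(0.0, 2.0, 201)
payoffs = []
for a in incentives:
    e_star = agent_best_effort(a)
    payoffs.append(2.0 * quality(e_star) - a * quality(e_star))
a_opt = incentives[np.argmax(payoffs)]
print(a_opt, agent_best_effort(a_opt))
```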


Learning Personalized Thermal Preferences via Bayesian Active Learning with Unimodality Constraints

arXiv.org Machine Learning

Thermal preferences vary from person to person and may change over time. The main objective of this paper is to sequentially pose intelligent queries to occupants in order to optimally learn the indoor air temperature values which maximize their satisfaction. Our central hypothesis is that an occupant's preference relation over indoor air temperature can be described using a scalar function of these temperatures, which we call the "occupant's thermal utility function". Information about an occupant's preference over these temperatures is available to us through their response to thermal preference queries: "prefer warmer," "prefer cooler," and "satisfied," which we interpret as statements about the derivative of their utility function, i.e., that the utility function is "increasing", "decreasing", and "constant", respectively. We model this hidden utility function using a Gaussian process prior with a built-in unimodality constraint, i.e., the utility function has a unique maximum, and we train this model using Bayesian inference. This permits an expected-improvement-based selection of the next preference query to pose to the occupant, which takes into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling from areas which are likely to offer an improvement over the current best observation). We use this framework to sequentially design experiments and illustrate its benefits by showing that it requires drastically fewer observations to learn the maximally preferred temperature values as compared to other methods. This framework is an important step towards the development of intelligent HVAC systems which would be able to respond to occupants' personalized thermal comfort needs. In order to encourage the use of our preference elicitation (PE) framework and ensure reproducibility of results, we publish an implementation of our work named GPPrefElicit as an open-source package in Python.
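
The acquisition step is standard expected improvement; a sketch with made-up posterior values follows. The unimodal GP machinery that would actually produce these posteriors from preference responses is the paper's contribution and is omitted here.

```python
import numpy as np
from scipy.stats import norm

# Expected improvement: EI(x) = (mu - best) * Phi(z) + sigma * phi(z),
# with z = (mu - best) / sigma. The first term rewards exploitation,
# the second rewards exploration via posterior uncertainty.
def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

temps = np.linspace(18.0, 28.0, 6)                  # candidate setpoints (deg C)
mu = np.array([0.1, 0.4, 0.7, 0.6, 0.3, 0.0])       # posterior mean of utility
sigma = np.array([0.05, 0.1, 0.2, 0.3, 0.4, 0.5])   # posterior std

ei = expected_improvement(mu, sigma, best=mu.max())
print(temps[np.argmax(ei)])  # next temperature to query the occupant at
```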


Deep active subspaces - a scalable method for high-dimensional uncertainty propagation

arXiv.org Machine Learning

A problem of considerable importance within the field of uncertainty quantification (UQ) is the development of efficient methods for the construction of accurate surrogate models. Such efforts are particularly important to applications constrained by high-dimensional uncertain parameter spaces. The difficulty of accurate surrogate modeling in such systems is further compounded by data scarcity brought about by the large cost of forward model evaluations. Traditional response surface techniques, such as Gaussian process regression (or Kriging) and polynomial chaos, are difficult to scale to high dimensions. To make surrogate modeling tractable in expensive high-dimensional systems, one must resort to dimensionality reduction of the stochastic parameter space. A recent dimensionality reduction technique that has shown great promise is the method of 'active subspaces'. The classical formulation of active subspaces, unfortunately, requires gradient information from the forward model, which is often impossible to obtain. In this work, we present a simple, scalable method for recovering active subspaces in high-dimensional stochastic systems without gradient information; it relies on a reparameterization of the orthogonal active subspace projection matrix, which we couple with deep neural networks. We demonstrate our approach on synthetic and real-world datasets and show favorable predictive comparison to classical active subspaces.
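
One way to realize the reparameterization the abstract describes is to learn the projection matrix under an orthogonality parametrization jointly with a neural network regression head. The sketch below uses PyTorch's built-in orthogonal parametrization and a synthetic forward model as stand-ins; the exact reparameterization in the paper may differ.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

# Deep active subspace surrogate: an orthogonally-parametrized projection
# maps x to a low-dimensional active variable z = W^T x, and a small network
# regresses the model output on z. No forward-model gradients are needed.
D, k = 100, 2                                      # input dim, subspace dim

class DeepActiveSubspace(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = orthogonal(nn.Linear(D, k, bias=False))  # orthonormal rows
        self.head = nn.Sequential(nn.Linear(k, 32), nn.Tanh(), nn.Linear(32, 1))

    def forward(self, x):
        return self.head(self.proj(x))

# Hypothetical expensive model: output depends only on a 2D subspace of x.
torch.manual_seed(0)
A = torch.randn(D, 2)
f = lambda x: torch.sin(x @ A[:, :1]) + (x @ A[:, 1:]) ** 2

x = torch.randn(512, D)
y = f(x)

model = DeepActiveSubspace()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(500):
    loss = (model(x) - y).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())  # projection and surrogate are learned jointly
```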