 gp surrogate





Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry

arXiv.org Machine Learning

Parametric partial differential equations (PDEs) are fundamental mathematical tools for modeling complex physical systems, yet their numerical evaluation across parameter spaces remains computationally intensive when using conventional high-fidelity solvers. To address this challenge, we propose a novel physical law-corrected prior Gaussian process (LC-prior GP) surrogate modeling framework that effectively integrates data-driven learning with underlying physical constraints to flexibly handle multi-coupled variables defined on complex geometries. The proposed approach leverages proper orthogonal decomposition (POD) to parameterize high-dimensional PDE solutions via their dominant modes and associated coefficients, thereby enabling efficient Gaussian process (GP) surrogate modeling within a reduced-dimensional coefficient space. A key contribution lies in the incorporation of physical laws together with a limited number of parameter samples to correct the GP posterior mean, thus avoiding reliance on computationally expensive numerical solvers. Furthermore, interpolation functions are constructed to describe the mapping from the full parameter space to the physics-based correction term. This mapping is subsequently backpropagated to constrain the original GP surrogate, yielding a more physically consistent conditional prior. To handle irregular geometries, the radial basis function-finite difference (RBF-FD) method is incorporated during training set computation, with its inherent differentiation matrices providing both computational efficiency and numerical accuracy for physical constraint optimization. The effectiveness of the proposed method is demonstrated through numerical experiments involving a reaction-diffusion model, miscible flooding models, and Navier-Stokes equations with multi-physics coupling defined on irregular domains.
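The POD step described above can be illustrated with a minimal sketch, assuming the snapshots are stored column-wise and the modes are taken from a thin SVD of the mean-centered snapshot matrix. The function name `pod_basis`, the energy threshold, and the toy solution family are hypothetical and are not taken from the paper; the GP regression on the resulting coefficients and the physics-based correction are not shown.

```python
import numpy as np

def pod_basis(snapshots, energy=0.9999):
    """Return the mean, leading POD modes, and coefficients of a snapshot matrix.

    snapshots: array of shape (n_dof, n_samples), one solution per column.
    energy: fraction of squared singular values to retain (assumed threshold).
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    centered = snapshots - mean
    # Thin SVD: columns of U are POD modes, singular values rank their energy.
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy) + 1)
    modes = U[:, :r]
    coeffs = modes.T @ centered  # shape (r, n_samples)
    return mean, modes, coeffs

# Toy parametric family (hypothetical): u(x; mu) = mu sin(pi x) + 0.1 mu^2 sin(2 pi x)
x = np.linspace(0.0, 1.0, 200)
mus = np.linspace(0.5, 2.0, 30)
snapshots = np.stack(
    [m * np.sin(np.pi * x) + 0.1 * m**2 * np.sin(2 * np.pi * x) for m in mus],
    axis=1,
)

mean, modes, coeffs = pod_basis(snapshots)
reconstruction = mean + modes @ coeffs
rel_error = np.linalg.norm(reconstruction - snapshots) / np.linalg.norm(snapshots)
```

In a surrogate workflow of this kind, each row of `coeffs` would become a regression target as a function of the parameter `mu`, so the model is fitted in the low-dimensional coefficient space rather than on the full solution field.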


A Implementation and additional empirical results

Neural Information Processing Systems

Here we summarize implementation details and experimental results that were removed from the main body of the paper due to space constraints. All of the empirical work in our paper is fully reproducible. Code may be found in our git repository: http://bitbucket.org/gramacylab/tricands. The GP surrogate for the Goldstein-Price and Hartman 6 examples (Section 3 and Appendix A.2, respectively) used the [...] Gramacy [2020], modified to keep track of the number of evaluations. The heteroskedastic GP surrogate used for ATO (Section 3.2) was via [...] Sometimes this resulted in replications in the design.


Advancing calibration for stochastic agent-based models in epidemiology with Stein variational inference and Gaussian process surrogates

arXiv.org Machine Learning

Accurate calibration of stochastic agent-based models (ABMs) in epidemiology is crucial to make them useful in public health policy decisions and interventions. Traditional calibration methods that yield a probability density function for the parameters being calibrated, e.g., Markov chain Monte Carlo (MCMC), are often computationally expensive. When applied to highly parametrized ABMs, the calibration process becomes computationally infeasible. This paper investigates the utility of Stein Variational Inference (SVI) as an alternative calibration technique for stochastic epidemiological ABMs approximated by Gaussian process (GP) surrogates. SVI leverages gradient information to iteratively update a set of particles in the space of parameters being calibrated, offering potential advantages in scalability and efficiency for high-dimensional ABMs. The ensemble of particles yields a joint probability density function for the parameters and serves as the calibration. We compare the performance of SVI and MCMC in calibrating CityCOVID, a stochastic epidemiological ABM, focusing on predictive accuracy and calibration effectiveness. Our results demonstrate that SVI maintains predictive accuracy and calibration effectiveness comparable to MCMC, making it a viable alternative for complex epidemiological models. We also present the practical challenges of using a gradient-based calibration method such as SVI, which include careful tuning of hyperparameters and monitoring of the particle dynamics.
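The particle update that the abstract describes can be sketched with the standard Stein variational gradient descent rule and a fixed-bandwidth RBF kernel. This is an illustration under stated assumptions, not the paper's implementation: the Gaussian target, the step size, the bandwidth, and the name `svgd_step` are all hypothetical, and in the paper's setting the log-density gradient would come from a GP surrogate of the ABM rather than a closed form.

```python
import numpy as np

def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One SVGD update for particles of shape (n, d) with an RBF kernel."""
    diffs = particles[:, None, :] - particles[None, :, :]  # diffs[i, j] = x_i - x_j
    sq_dists = np.sum(diffs**2, axis=-1)
    K = np.exp(-sq_dists / (2.0 * bandwidth**2))           # kernel matrix, (n, n)
    grads = grad_log_p(particles)                          # (n, d)
    # Attractive term: kernel-weighted average of log-density gradients.
    attract = K @ grads
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) = sum_j K[i, j] (x_i - x_j) / h^2.
    repulse = np.sum(K[:, :, None] * diffs, axis=1) / bandwidth**2
    return particles + step * (attract + repulse) / particles.shape[0]

# Hypothetical target: 2-D Gaussian with identity covariance and this mean.
target_mean = np.array([1.0, -1.0])
grad_log_p = lambda x: -(x - target_mean)

rng = np.random.default_rng(0)
particles = rng.normal(loc=5.0, scale=1.0, size=(50, 2))  # start far from the target
for _ in range(1000):
    particles = svgd_step(particles, grad_log_p)

mean_error = np.linalg.norm(particles.mean(axis=0) - target_mean)
spread = particles.std(axis=0)
```

The repulsive term is what keeps the particles from collapsing onto the mode, which is one reason the particle dynamics need monitoring: a poorly chosen bandwidth or step size changes the balance between the two terms.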


Gaussian Process Surrogate Models for Neural Networks

arXiv.org Machine Learning

Not being able to understand and predict the behavior of deep learning systems makes it hard to decide what architecture and algorithm to use for a given problem. In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque. Modeling replaces a complex system with a simpler, more interpretable surrogate. Drawing inspiration from this, we construct a class of surrogate models for neural networks using Gaussian processes. Rather than deriving kernels for infinite neural networks, we learn kernels empirically from the naturalistic behavior of finite neural networks. We demonstrate our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems such as identifying which points most influence the behavior of specific neural networks and predicting which architectures and algorithms will generalize well for specific datasets.


Trajectory-oriented optimization of stochastic epidemiological models

arXiv.org Machine Learning

Epidemiological models must be calibrated to ground truth for downstream tasks such as producing forward projections or running what-if scenarios. The meaning of calibration changes in the case of a stochastic model, since output from such a model is generally described via an ensemble or a distribution. Each member of the ensemble is usually mapped to a random number seed (explicitly or implicitly). With the goal of finding not only the input parameter settings but also the random seeds that are consistent with the ground truth, we propose a class of Gaussian process (GP) surrogates along with an optimization strategy based on Thompson sampling. This Trajectory Oriented Optimization (TOO) approach produces actual trajectories close to the empirical observations, instead of a set of parameter settings where only the mean simulation behavior matches the ground truth.
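For readers unfamiliar with Thompson sampling on a GP surrogate, the following is a generic one-dimensional sketch: draw one function from the GP posterior on a grid and evaluate the simulator where that draw is smallest. It is not the TOO method itself, which also treats random seeds as inputs; the kernel, lengthscale, noise level, objective, and the name `thompson_pick` are all assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def thompson_pick(x_obs, y_obs, x_grid, rng, noise=1e-4):
    """Draw one GP posterior sample on x_grid and return its minimizer."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_grid, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    cov = rbf(x_grid, x_grid) - Ks @ np.linalg.solve(K, Ks.T)
    # Sample via an eigendecomposition, clipping tiny negative eigenvalues
    # that arise from round-off in the ill-conditioned covariance.
    w, V = np.linalg.eigh(0.5 * (cov + cov.T))
    z = rng.standard_normal(len(w))
    sample = mean + V @ (np.sqrt(np.clip(w, 0.0, None)) * z)
    return x_grid[np.argmin(sample)]

rng = np.random.default_rng(0)
objective = lambda x: (x - 0.3) ** 2   # hypothetical stand-in for a simulator discrepancy
x_grid = np.linspace(0.0, 1.0, 101)
x_obs = np.array([0.0, 0.5, 1.0])
y_obs = objective(x_obs)

for _ in range(15):
    x_next = thompson_pick(x_obs, y_obs, x_grid, rng)
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

best_x = x_obs[np.argmin(y_obs)]
```

Because each acquisition uses a single random posterior draw, exploration comes from the posterior uncertainty itself and no separate exploration parameter is needed.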


Hybrid Models for Mixed Variables in Bayesian Optimization

arXiv.org Artificial Intelligence

This paper presents a new type of hybrid model for Bayesian optimization (BO) adept at managing mixed variables, encompassing both quantitative (continuous and integer) and qualitative (categorical) types. Our proposed hybrid models merge a Monte Carlo tree search (MCTS) structure for categorical variables with Gaussian processes (GPs) for continuous ones. To address efficiency in the search phase, we juxtapose the original (frequentist) upper confidence bound tree search (UCTS) and the Bayesian Dirichlet search strategies, showcasing the tree architecture's integration into Bayesian optimization. Central to our innovation in the surrogate modeling phase is online kernel selection for mixed-variable BO. Our innovations, including dynamic kernel selection, unique UCTS (hybridM) and Bayesian update strategies (hybridD), position our hybrid models as an advancement in mixed-variable surrogate models. Numerical experiments underscore the hybrid models' superiority, highlighting their potential in Bayesian optimization.

Keywords: Gaussian processes, Monte Carlo tree search, categorical variables, online kernel selection.

The discussion of different types of encodings can be found in Cerda et al. (2018).

1 Introduction

Our motivating problem is to optimize a "black-box" function of "mixed" variables that lacks an analytic expression. "Mixed" signifies that the function's input variables comprise both continuous (quantitative) and categorical (qualitative) variables, which is common in machine learning and scientific computing tasks such as performance tuning of mathematical libraries and application codes at runtime and compile-time (Balaprakash et al., 2018). Bayesian optimization (BO) with Gaussian process (GP) surrogate models is a prevalent method for optimizing noisy, expensive black-box functions, and is primarily designed for continuous-variable functions (Shahriari et al., 2016; Sid-Lakhdar et al., 2020).
Extending BO to mixed-variable functions presents theoretical and computational challenges due to variable type differences (Table 1). Continuous variables have uncountably many values with magnitudes and intrinsic ordering, allowing natural gradient definition. In contrast, categorical variables, having finitely many values without intrinsic ordering or magnitude, require encoding in the GP context, potentially inducing discontinuity and degrading GP performance (Luo et al., 2021). The empirical rule of thumb for handling an integer variable (Karlsson et al., 2020) is to treat it as a categorical variable if the number of integer values (i.e., number of categorical values) is small, or as a continuous variable with embedding (a.k.a.
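The remark that categorical variables require encoding in the GP context can be made concrete with a small sketch, assuming one-hot encoding and a squared-exponential kernel. The category names and helper functions are hypothetical and are not the paper's hybrid model. Under this encoding every pair of distinct categories is equally far apart, so the kernel carries no notion of ordering or magnitude between them.

```python
import numpy as np

def one_hot(labels, categories):
    """Encode each label as a unit vector indexed by its position in categories."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(labels), len(categories)))
    for row, label in enumerate(labels):
        out[row, index[label]] = 1.0
    return out

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

categories = ["gcc", "clang", "icc"]        # hypothetical categorical levels
X_cat = one_hot(categories, categories)     # one row per level
K_cat = rbf_kernel(X_cat, X_cat)

# A mixed input concatenates the continuous part with the encoded categorical part.
X_cont = np.array([[0.1], [0.5], [0.9]])
X_mixed = np.hstack([X_cont, X_cat])
K_mixed = rbf_kernel(X_mixed, X_mixed)
```

With unit lengthscale, every off-diagonal entry of `K_cat` equals exp(-1), since distinct one-hot vectors are always at squared distance 2. Switching category therefore produces the same fixed drop in correlation regardless of which two levels are involved.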


Impact Study of Numerical Discretization Accuracy on Parameter Reconstructions and Model Parameter Distributions

arXiv.org Machine Learning

In optical nanometrology, numerical models are widely used for parameter reconstructions. Using the Bayesian target vector optimization method, we fit a finite element numerical model to a grazing incidence X-ray fluorescence data set in order to obtain the geometrical parameters of a nanostructured line grating. Gaussian processes, which are stochastic machine learning surrogate models, were trained during the reconstruction and afterwards sampled with a Markov chain Monte Carlo sampler to determine the distribution of the reconstructed model parameters. The numerical discretization parameters of the finite element model affect the numerical discretization error of the forward model. We investigated the impact of the polynomial order of the finite element ansatz functions on the reconstructed parameters as well as on the model parameter distributions. We showed that such a convergence study makes it possible to determine numerical parameters that allow for efficient and accurate reconstructions.