model simulation
Simulation-Based Fitting of Intractable Models via Sequential Sampling and Local Smoothing
This paper presents a comprehensive algorithm for fitting generative models whose likelihood, moments, and other quantities typically used for inference are not analytically or numerically tractable. The proposed method aims to provide a general solution that requires only limited prior information on the model parameters. The algorithm combines a global search phase, aimed at identifying the region of the solution, with a local search phase that mimics a trust region version of the Fisher scoring algorithm for computing a quasi-likelihood estimator. Comparisons with alternative methods demonstrate the strong performance of the proposed approach. An R package implementing the algorithm is available on CRAN.
A novel sensitivity analysis method for agent-based models stratifies in-silico tumor spheroid simulations
Rohr, Edward H., Nardini, John T.
Agent-based models (ABMs) are widely used in biology to understand how individual actions scale into emergent population behavior. Modelers employ sensitivity analysis (SA) algorithms to quantify input parameters' impact on model outputs; however, SA is hard to perform for ABMs because of their computational cost and complexity. In this work, we develop the Simulate, Summarize, Reduce, Cluster, and Analyze (SSRCA) methodology, a machine-learning-based pipeline designed to facilitate SA for ABMs. In particular, SSRCA can achieve the following tasks for ABMs: 1) identify sensitive model parameters, 2) reveal common output model patterns, and 3) determine which input parameter values generate these patterns. We use an example ABM of tumor spheroid growth to showcase how SSRCA provides similar SA results to the popular Sobol' Method while also identifying four common patterns from the ABM and the parameter regions that generate these outputs. This analysis could streamline data-driven tasks, such as parameter estimation, for ABMs by reducing parameter space. While we highlight these results with an ABM on tumor spheroid formation, the SSRCA methodology is broadly applicable to biological ABMs.
Generative weather for improved crop model simulations
Accurate and precise crop yield prediction is invaluable for decision making at both farm and regional levels. Crop models are widely used for yield prediction because they can simulate hypothetical scenarios. While the accuracy and precision of yield prediction critically depend on weather inputs to simulations, surprisingly little attention has been paid to preparing weather inputs. We propose a new method to construct generative models for long-term weather forecasts and ultimately improve crop yield prediction. We demonstrate use of the method in two representative scenarios: single-year production of wheat, barley and canola, and three-year production using rotations of these crops. Results show significant improvement over the conventional method, measured in terms of mean and standard deviation of prediction errors. Our method outperformed the conventional method in every one of 18 metrics for the first scenario and in 29 out of 36 metrics for the second scenario. So that individual crop modellers can start applying the method to their problems, technical details are carefully explained, and all the code, trained PyTorch models, APSIM simulation files and result data are made available.
Fast, accurate and lightweight sequential simulation-based inference using Gaussian locally linear mappings
Häggström, Henrik, Rodrigues, Pedro L. C., Oudoumanessah, Geoffroy, Forbes, Florence, Picchini, Umberto
Bayesian inference for complex models with an intractable likelihood can be tackled using algorithms performing many calls to computer simulators. These approaches are collectively known as "simulation-based inference" (SBI). Recent SBI methods have made use of neural networks (NN) to provide approximate, yet expressive constructs for the unavailable likelihood function and the posterior distribution. However, they do not generally achieve an optimal trade-off between accuracy and computational demand. In this work, we propose an alternative that provides both approximations to the likelihood and the posterior distribution, using structured mixtures of probability distributions. Our approach produces accurate posterior inference when compared to state-of-the-art NN-based SBI methods, while exhibiting a much smaller computational footprint. We illustrate our results on several benchmark models from the SBI literature.
Bayesian score calibration for approximate models
Bon, Joshua J, Warne, David J, Nott, David J, Drovandi, Christopher
Scientists continue to develop increasingly complex mechanistic models to reflect their knowledge more realistically. Statistical inference using these models can be challenging since the corresponding likelihood function is often intractable and model simulation may be computationally burdensome. Fortunately, in many of these situations, it is possible to adopt a surrogate model or approximate likelihood function. It may be convenient to conduct Bayesian inference directly with the surrogate, but this can result in bias and poor uncertainty quantification. In this paper we propose a new method for adjusting approximate posterior samples to reduce bias and produce more accurate uncertainty quantification. We do this by optimizing a transform of the approximate posterior that maximizes a scoring rule. Our approach requires only a (fixed) small number of complex model simulations and is numerically stable. We demonstrate good performance of the new method on several examples of increasing complexity.
Machine Learning based Parameter Sensitivity of Regional Climate Models -- A Case Study of the WRF Model for Heat Extremes over Southeast Australia
Reddy, P. Jyoteeshkumar, Chinta, Sandeep, Matear, Richard, Taylor, John, Baki, Harish, Thatcher, Marcus, Kala, Jatin, Sharples, Jason
Heatwaves and bushfires cause substantial impacts on society and ecosystems across the globe. Accurate information on heat extremes is needed to support the development of actionable mitigation and adaptation strategies. Regional climate models are commonly used to better understand the dynamics of these events. These models have very large input parameter sets, and the parameters within the physics schemes substantially influence the model's performance. However, parameter sensitivity analysis (SA) of regional models for heat extremes is largely unexplored. Here, we focus on the southeast Australian region, one of the global hotspots of heat extremes. In southeast Australia, the Weather Research and Forecasting (WRF) model is the most widely used regional model for simulating extreme weather events. Hence, in this study, we focus on the sensitivity of surface meteorological variables such as temperature, relative humidity, and wind speed to WRF model parameters during two extreme heat events over southeast Australia. Due to the presence of multiple parameters and their complex relationship with output variables, a machine learning (ML) surrogate-based global sensitivity analysis method is considered for the SA. The ML surrogate-based Sobol SA is used to identify the sensitivity of 24 adjustable parameters in seven different physics schemes of the WRF model. Results show that out of these 24, only three parameters, namely the scattering tuning parameter, multiplier of saturated soil water content, and profile shape exponent in the momentum diffusivity coefficient, are important for the considered meteorological variables. These SA results are consistent for the two different extreme heat events. Further, we investigated the physical significance of sensitive parameters. This study's results will help in further optimising WRF parameters to improve model simulation.
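To make the idea of surrogate-based Sobol sensitivity analysis concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the `surrogate` function is a hypothetical stand-in for a trained ML surrogate of WRF, the three inputs are assumed to be independent and uniform on [0, 1], and the first-order indices are estimated with the standard pick-freeze (Saltelli-style) Monte Carlo estimator.

```python
import random

def surrogate(x):
    # Hypothetical cheap surrogate (an assumption, not a WRF emulator):
    # output depends strongly on x[0], weakly on x[1], and not at all on x[2].
    return 4.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]

def first_order_sobol(model, dim, n_samples, seed=0):
    """Estimate first-order Sobol indices for inputs uniform on [0, 1]."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B.
    a = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]
    all_y = y_a + y_b
    mean = sum(all_y) / len(all_y)
    var = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)
    indices = []
    for i in range(dim):
        total = 0.0
        for row_a, row_b, ya, yb in zip(a, b, y_a, y_b):
            # A with its i-th column replaced by the i-th column of B.
            mixed = list(row_a)
            mixed[i] = row_b[i]
            total += (yb - mean) * (model(mixed) - ya)
        indices.append(total / n_samples / var)
    return indices

sobol = first_order_sobol(surrogate, dim=3, n_samples=20000)
print([round(s, 3) for s in sobol])
```

For this toy surrogate the exact first-order indices are 16/17, 1/17 and 0, so the estimate should rank the first input as dominant. The estimator needs (dim + 2) model evaluations per sample, which is why a cheap surrogate is attractive when the underlying simulator is expensive.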
Misspecification-robust Sequential Neural Likelihood
Kelly, Ryan P., Nott, David J., Frazier, David T., Warne, David J., Drovandi, Chris
Simulation-based inference (SBI) techniques are now an essential tool for the parameter estimation of mechanistic and simulatable models with intractable likelihoods. Statistical approaches to SBI such as approximate Bayesian computation and Bayesian synthetic likelihood have been well studied in the well specified and misspecified settings. However, most implementations are inefficient in that many model simulations are wasted. Neural approaches such as sequential neural likelihood (SNL) have been developed that exploit all model simulations to build a surrogate of the likelihood function. However, SNL approaches have been shown to perform poorly under model misspecification. In this paper, we develop a new method for SNL that is robust to model misspecification and can identify areas where the model is deficient. We demonstrate the usefulness of the new approach on several illustrative examples.
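As an illustration of why classical statistical SBI can waste simulations, the following is a minimal sketch of rejection approximate Bayesian computation in Python. Everything in it is an assumption made for illustration: the toy simulator (normal data with unknown mean), the uniform prior on [-5, 5], the sample-mean summary, and the tolerance. It is not the method proposed in the paper.

```python
import random
import statistics

def simulate_summary(theta, n_obs, rng):
    # Toy simulator (an assumption): n_obs draws from Normal(theta, 1),
    # summarised by their sample mean.
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n_obs))

def abc_rejection(observed_summary, n_trials, tolerance, n_obs=20, seed=0):
    """Keep prior draws whose simulated summary lands near the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_trials):
        theta = rng.uniform(-5.0, 5.0)  # draw from the uniform prior
        summary = simulate_summary(theta, n_obs, rng)
        if abs(summary - observed_summary) < tolerance:
            accepted.append(theta)
    return accepted

n_trials = 20000
accepted = abc_rejection(observed_summary=1.5, n_trials=n_trials, tolerance=0.2)
acceptance_rate = len(accepted) / n_trials
posterior_mean = statistics.fmean(accepted)
print(f"accepted {len(accepted)} of {n_trials} simulations")
print(f"approximate posterior mean: {posterior_mean:.3f}")
```

Only a small fraction of the simulations is accepted in this setup; the rest are discarded. Surrogate-based approaches such as SNL instead use every simulated pair to fit the likelihood approximation.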
Representation learning for a generalized, quantitative comparison of complex model outputs
Cess, Colin G., Finley, Stacey D.
Computational models are quantitative representations of systems. By analyzing and comparing the outputs of such models, it is possible to gain a better understanding of the system itself. As the complexity of model outputs increases, though, it becomes increasingly difficult to compare simulations to each other. While it is straightforward to compare only a few specific model outputs across multiple simulations, additional useful information can come from comparing model simulations as a whole. However, it is difficult to holistically compare model simulations in an unbiased manner. To address these limitations, we use representation learning to transform model simulations into low-dimensional points, with the neural networks capturing the relationships between the model outputs without the need to manually specify which outputs to focus on. The distance in low-dimensional space acts as a comparison metric, reducing the difference between simulations to a single value. We provide an approach to training neural networks on model simulations and show how the trained networks can then be used to provide a holistic comparison of model outputs. This approach can be applied to a wide range of model types, providing a quantitative method of analyzing the complex outputs of computational models.
Learned Turbulence Modelling with Differentiable Fluid Solvers: Physics-based Loss-functions and Optimisation Horizons
List, Björn, Chen, Li-Wei, Thuerey, Nils
In this paper, we train turbulence models based on convolutional neural networks. These learned turbulence models improve under-resolved, low-resolution solutions to the incompressible Navier-Stokes equations at simulation time. Our study involves the development of a differentiable numerical solver that supports the propagation of optimisation gradients through multiple solver steps. The significance of this property is demonstrated by the superior stability and accuracy of those models that unroll more solver steps during training. Furthermore, we introduce loss terms based on turbulence physics that further improve the model accuracy. This approach is applied to three two-dimensional turbulent flow scenarios: a homogeneous decaying turbulence case, a temporally evolving mixing layer, and a spatially evolving mixing layer. Our models achieve significant improvements in long-term a-posteriori statistics when compared to no-model simulations, without requiring these statistics to be directly included in the learning targets. At inference time, our proposed method also achieves substantial performance improvements over similarly accurate, purely numerical methods.
Vertical GaN Diode BV Maximization through Rapid TCAD Simulation and ML-enabled Surrogate Model
Lu, Albert, Marshall, Jordan, Wang, Yifan, Xiao, Ming, Zhang, Yuhao, Wong, Hiu Yung
In this paper, two methodologies are used to speed up the maximization of the breakdown voltage (BV) of a vertical GaN diode that has a theoretical maximum BV of ~2100V. First, we demonstrate a 5X faster, accurate simulation method in Technology Computer-Aided Design (TCAD). This allows us to find 50% more high-BV (>1400V) designs in a given simulation time. Second, a machine learning (ML) model is developed using TCAD-generated data and used as a surrogate model for differential evolution optimization. It can inversely design an out-of-the-training-range structure with BV as high as 1887V (89% of the ideal case) compared to ~1100V designed with human domain expertise.
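The combination of a surrogate model with differential evolution can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' setup: `surrogate_score` is a hypothetical smooth function standing in for a trained ML model of BV, the two design variables and their bounds are invented, and the optimizer is a plain DE/rand/1/bin loop written from scratch.

```python
import random

def surrogate_score(design):
    # Hypothetical surrogate (an assumption): a smooth score that peaks
    # at the design (0.3, 0.7). A real surrogate would be a trained model.
    x, y = design
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)

def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           f=0.8, cr=0.9, seed=0):
    """Maximise `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than i.
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or d == forced:
                    value = pop[r1][d] + f * (pop[r2][d] - pop[r3][d])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial if it is at least as good.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

best_design, best_score = differential_evolution(
    surrogate_score, bounds=[(0.0, 1.0), (0.0, 1.0)]
)
print(best_design, best_score)
```

Because each objective call hits the surrogate rather than a full TCAD run, the optimizer can afford thousands of evaluations. Any design it proposes would still need to be checked against the full simulator, particularly when it lies outside the training range.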