Computer Experiment
Deep Intrinsic Coregionalization Multi-Output Gaussian Process Surrogate with Active Learning
Deep Gaussian Processes (DGPs) are powerful surrogate models known for their flexibility and ability to capture complex functions. However, extending them to multi-output settings remains challenging due to the need for efficient dependency modeling. We propose the Deep Intrinsic Coregionalization Multi-Output Gaussian Process (deepICMGP) surrogate for computer simulation experiments involving multiple outputs, which extends the Intrinsic Coregionalization Model (ICM) by introducing hierarchical coregionalization structures across layers. This enables deepICMGP to effectively model nonlinear and structured dependencies between multiple outputs, addressing key limitations of traditional multi-output GPs. We benchmark deepICMGP against state-of-the-art models, demonstrating its competitive performance. Furthermore, we incorporate active learning strategies into deepICMGP to optimize sequential design tasks, enhancing its ability to efficiently select informative input locations for multi-output systems.
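As background for the abstract, the ICM building block can be written as Cov[f_i(x), f_j(x')] = B_{ij} k(x, x') for a positive semi-definite coregionalization matrix B. Below is a minimal single-layer sketch in Python; the names and values are illustrative, not the authors' implementation:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of inputs."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def icm_covariance(X1, X2, B, lengthscale=1.0):
    """ICM joint covariance over all outputs at all inputs:
    Cov[f_i(x), f_j(x')] = B[i, j] * k(x, x'), i.e. the Kronecker product of B and K."""
    return np.kron(B, rbf_kernel(X1, X2, lengthscale))

# A PSD coregionalization matrix for 2 outputs: B = A A^T + diag(kappa).
A = np.array([[1.0], [0.8]])
B = A @ A.T + np.diag([0.1, 0.1])

X = np.random.rand(5, 2)
Sigma = icm_covariance(X, X, B)   # (2*5) x (2*5) joint covariance matrix
```

Per the abstract, deepICMGP stacks such coregionalized layers hierarchically, in the spirit of a deep GP, so that output dependencies are modeled at every layer rather than only once at the top; the specifics of that hierarchy are the paper's contribution and are not reproduced here.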
Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design
Kim, Hwanwoo, Mak, Simon, Schuetz, Ann-Kathrin, Poon, Alan
With advances in scientific computing, computer experiments are increasingly used for optimizing complex systems. However, for modern applications, e.g., the optimization of nuclear physics detectors, each experiment run can require hundreds of CPU hours, making the optimization of its black-box simulator over a high-dimensional space a challenging task. Given limited runs at inputs $\mathbf{x}_1, \cdots, \mathbf{x}_n$, the best solution from these evaluated inputs can be far from optimal, particularly as dimensionality increases. Existing black-box methods, however, largely employ this "pick-the-winner" (PW) solution, which leads to mediocre optimization performance. To address this, we propose a new Black-box Optimization via Marginal Means (BOMM) approach. The key idea is a new estimator of a global optimizer $\mathbf{x}^*$ that leverages the so-called marginal mean functions, which can be efficiently inferred with limited runs in high dimensions. Unlike PW, this estimator can select solutions beyond evaluated inputs for improved optimization performance. Assuming the objective function follows a generalized additive model with unknown link function and under mild conditions, we prove that the BOMM estimator not only is consistent for optimization, but also has an optimization rate that tempers the "curse of dimensionality" faced by existing methods, thus enabling better performance as dimensionality increases. We present a practical framework for implementing BOMM using the transformed additive Gaussian process surrogate model. Finally, we demonstrate the effectiveness of BOMM in numerical experiments and an application to neutrino detector optimization in nuclear physics.
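To make the marginal-means idea concrete, here is a hypothetical sketch; the function name, defaults, and grid are ours, assuming inputs scaled to the unit cube and a cheap surrogate `predict` (e.g. the posterior mean of a GP fitted to the n runs):

```python
import numpy as np

def marginal_mean_optimum(predict, dim, grid_size=50, n_mc=500, minimize=True, seed=0):
    """Hypothetical marginal-means estimator: for each coordinate d, estimate
    g_d(t) = E[f(X) | X_d = t] by Monte Carlo over the other coordinates using
    the cheap surrogate `predict`, then return the coordinate-wise optimizer."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, grid_size)   # inputs assumed scaled to [0, 1]^dim
    x_star = np.empty(dim)
    for d in range(dim):
        Z = rng.random((n_mc, dim))           # MC draws for the other coordinates
        g = np.empty(grid_size)
        for k, t in enumerate(grid):
            Zt = Z.copy()
            Zt[:, d] = t                      # pin coordinate d, average out the rest
            g[k] = predict(Zt).mean()
        x_star[d] = grid[g.argmin() if minimize else g.argmax()]
    return x_star
```

Roughly, under the generalized additive model with monotone link that the abstract assumes, each marginal mean is monotone in the corresponding additive component, so its coordinate-wise optimizer recovers that coordinate of $\mathbf{x}^*$ even when no evaluated input lies near the optimum.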
Effect Decomposition of Functional-Output Computer Experiments via Orthogonal Additive Gaussian Processes
Tan, Yu, Li, Yongxiang, Dai, Xiaowu, Tsui, Kwok-Leung
Functional ANOVA (FANOVA) is a widely used variance-based sensitivity analysis tool. However, studies on functional-output FANOVA remain relatively scarce, especially for black-box computer experiments, which often involve complex and nonlinear functional-output relationships with unknown data distribution. Conventional approaches often rely on predefined basis functions or parametric structures that lack the flexibility to capture complex nonlinear relationships. Additionally, strong assumptions about the underlying data distributions further limit their ability to achieve a data-driven orthogonal effect decomposition. To address these challenges, this study proposes a functional-output orthogonal additive Gaussian process (FOAGP) to efficiently perform the data-driven orthogonal effect decomposition. By enforcing a conditional orthogonality constraint on the separable prior process, the proposed functional-output orthogonal additive kernel enables data-driven orthogonality without requiring prior distributional assumptions. The FOAGP framework also provides analytical formulations for local Sobol' indices and expected conditional variance sensitivity indices, enabling comprehensive sensitivity analysis by capturing both global and local effect significance. Validation through two simulation studies and a real case study on fuselage shape control confirms the model's effectiveness in orthogonal effect decomposition and variance decomposition, demonstrating its practical value in engineering applications.
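For reference, the FANOVA decomposition the abstract builds on, in its standard textbook form for independent inputs (not the paper's functional-output notation):

```latex
% FANOVA decomposition with orthogonal effects (independent inputs):
f(\mathbf{x}) = f_0 + \sum_{i} f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \cdots,
\quad \text{with } \int f_u(\mathbf{x}_u)\,\mathrm{d}x_k = 0 \text{ for each } k \in u.
% Orthogonality yields the variance decomposition and Sobol' indices:
\operatorname{Var}[f] = \sum_{u \neq \emptyset} \operatorname{Var}[f_u],
\qquad S_u = \operatorname{Var}[f_u] / \operatorname{Var}[f].
```

FOAGP's contribution, per the abstract, is to enforce this orthogonality through a constraint on the GP prior itself, so the decomposition is data-driven rather than imposed through a predefined basis or distributional assumptions.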
Reviews: Demystifying Black-box Models with Symbolic Metamodels
I am new to the domain of symbolic regression and found the article to be a well-written and interesting introduction to it. Yet I kept wondering to what extent the presented approach can really help interpret complex black-box functions. In the final example, it is clear that the results are fairly simple and interpretable while incurring only a moderate loss in predictivity compared to the crude algorithm. But more generally, I still don't see how combinations of Bessel functions and the like will help most practitioners. This leads to a question that, to the best of my understanding, was somewhat underinvestigated here: a more systematic approach to tuning the complexity of the metamodel, and perhaps exploring the Pareto front of simplicity versus predictivity, as sketched below.
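The suggested exploration is easy to operationalize once each candidate metamodel is scored on both criteria; a minimal sketch (the scoring scheme and candidates are hypothetical, not from the article):

```python
def pareto_front(models):
    """Non-dominated (complexity, error) pairs: a metamodel stays on the front
    if no other candidate is both simpler and more predictive (both minimized).
    `models` holds (name, complexity, error) triples."""
    return sorted(
        (m for m in models
         if not any(c <= m[1] and e <= m[2] and (c, e) != (m[1], m[2])
                    for _, c, e in models)),
        key=lambda m: m[1])

# e.g. complexity = expression size of the symbolic metamodel,
#      error = held-out loss relative to the original black box.
print(pareto_front([("linear", 3, 0.30), ("bessel-mix", 12, 0.08), ("deep-net", 80, 0.05)]))
```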
Data-Adaptive Dimensional Analysis for Accurate Interpolation and Extrapolation in Computer Experiments
Rodriguez-Arelis, G. Alexi, Welch, William J.
In a wide range of natural phenomena and engineering processes, physical experimentation is resource-intensive or even impossible, motivating the now widespread use of mathematical models implemented as computer codes. This complementary way of doing science has for decades spawned corresponding research in statistical methodologies for the careful design and analysis of computer experiments (DACE, Currin et al., 1991; Sacks et al., 1989). When a complex computer code is expensive to evaluate, DACE replaces the code by a fast statistical model surrogate trained with limited code runs. The surrogate most commonly employed treats the unknown input-output function as a realization of a Gaussian stochastic process (GaSP), also known simply as a Gaussian process (GP). While there are many possible objectives of such an experiment, e.g., optimization or calibration, prediction of the original computer code output by the statistical surrogate underlies tackling the scientific objective. Therefore, obtaining good accuracy at untried values of the input variables is a fundamental goal. Typically, input variables are in a "raw" form, as provided to the code, and the output is similarly as produced by the code or some summary. This article maintains the basic GaSP surrogate paradigm but proposes to improve prediction accuracy by input and output transformations guided by dimensional analysis (DA), which is related to physical units of measurement (and not the number of variables).
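As a reminder of what dimensional analysis buys, here is a minimal Buckingham-Pi sketch; the example system and variables are illustrative, not from the paper, whose data-adaptive procedure goes well beyond this unit bookkeeping:

```python
import numpy as np
from scipy.linalg import null_space

# Illustrative example (not the paper's): a pendulum-period code with
# columns = [period T, length L, gravity g, mass m] and rows = units [s, m, kg].
D = np.array([
    [1, 0, -2, 0],   # exponents of seconds in each variable
    [0, 1,  1, 0],   # exponents of meters
    [0, 0,  0, 1],   # exponents of kilograms
])

# Buckingham Pi: exponent vectors of dimensionless groups span the null space of D.
# Here a single group survives, proportional to [2, -1, 1, 0],
# i.e. pi = T^2 g / L (mass cannot enter any dimensionless quantity).
pi_exponents = null_space(D)
print(pi_exponents.ravel())
```

Training the GaSP on such dimensionless inputs, with a correspondingly nondimensionalized output, is roughly the kind of transformation the article tunes to the data.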
Rational Kriging
Kriging is a technique for multivariate interpolation of arbitrarily scattered data. It originated in mining applications and was developed into the field of geostatistics through the pioneering work of Matheron (1963). It has now become a prominent technique for function approximation and uncertainty quantification in spatial statistics (Cressie, 2015), computer experiments (Santner et al., 2003), and machine learning (Rasmussen and Williams, 2006). Kriging can be briefly explained as follows.
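The excerpt ends before the explanation itself; as a stand-in, here is a minimal sketch of the standard simple-kriging predictor (not the paper's rational variant):

```python
import numpy as np

def kriging_predict(X, y, Xnew, lengthscale=0.3, nugget=1e-8):
    """Simple kriging / GP regression with a Gaussian correlation function:
    posterior mean m(x) = k(x)^T K^{-1} y and posterior variance
    s2(x) = 1 - k(x)^T K^{-1} k(x), assuming a zero prior mean."""
    def corr(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-0.5 * d2 / lengthscale**2)
    K = corr(X, X) + nugget * np.eye(len(X))   # nugget for numerical stability
    k = corr(Xnew, X)
    mean = k @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1)
    return mean, var

X = np.random.rand(20, 2)                        # scattered design points
y = np.sin(4 * X[:, 0]) + np.cos(3 * X[:, 1])    # cheap stand-in for a simulator
mu, s2 = kriging_predict(X, y, np.random.rand(5, 2))
```

The predictor interpolates the data exactly (up to the nugget) and its variance vanishes at the design points, which is what makes kriging attractive for deterministic computer experiments.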
Design choice and machine learning model performances
Arboretti, Rosa, Ceccato, Riccardo, Pegoraro, Luca, Salmaso, Luigi
An increasing number of publications present the joint application of Design of Experiments (DOE) and machine learning (ML) as a methodology to collect and analyze data on a specific industrial phenomenon. However, the literature shows that the choice of the design for data collection and the model for data analysis is often driven by incidental factors rather than by statistical or algorithmic advantages; as a result, there is a lack of studies providing guidelines on which designs and ML models to use jointly for data collection and analysis. This is the first time in the literature that a paper discusses the choice of design in relation to ML model performance. An extensive study is conducted that considers 12 experimental designs, 7 families of predictive models, 7 test functions that emulate physical processes, and 8 noise settings, both homoscedastic and heteroscedastic. The results of the research can have an immediate impact on the work of practitioners, providing guidelines for practical applications of DOE and ML.
Entropy-based adaptive design for contour finding and estimating reliability
Cole, D. Austin, Gramacy, Robert B., Warner, James E., Bomarito, Geoffrey F., Leser, Patrick E., Leser, William P.
Computer modeling of physical systems must accommodate uncertainty in materials and loading conditions. This input uncertainty translates into a stochastic response from the model, based on nominal settings of a physical system, even when the simulator is deterministic. In engineering, assessing the reliability of said system can mean guarding against physical collapse, puncture, or failure of electronics. Reliability statistics like failure probability, the probability that the response exceeds a threshold, can be calculated with Monte Carlo (MC). While MC produces an asymptotically unbiased estimator (Robert and Casella 2013), it can take thousands or even millions of model evaluations, i.e., great computational expense, to achieve a desired error tolerance. The search for alternatives to direct MC in computer-assisted reliability analysis has become a cottage industry of late. Some approaches seek to gradually reduce the design space for sampling through subset selection (Cannamela et al. 2008; Au and Beck 2001). Importance sampling (IS) focuses MC efforts by biasing sampling toward areas of the design space where failure is probable (Srinivasan 2013), and then re-weights any expectations to correct for that bias asymptotically. Effective IS strategies (Li et al. 2011; Peherstorfer et al. 2018a) aim to generate samples which reduce variance compared to direct MC.
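A toy contrast between direct MC and IS on a stand-in simulator; everything below, including the shifted-Gaussian proposal, is illustrative rather than any of the cited strategies:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 100_000

def response(x):
    """Stand-in for an expensive deterministic simulator (illustrative only)."""
    return x.sum(axis=1)

threshold = 4.0   # failure := response exceeds this limit

# Direct Monte Carlo: p_fail = P(response(X) > threshold) with X ~ N(0, I).
X = rng.standard_normal((n, d))
p_mc = np.mean(response(X) > threshold)

# Importance sampling: sample a proposal shifted toward the failure region,
# then re-weight by the density ratio phi(z) / phi(z - shift) so the
# estimator remains unbiased while its variance drops.
shift = np.full(d, 2.0)
Z = rng.standard_normal((n, d)) + shift
log_w = -Z @ shift + 0.5 * shift @ shift   # log of the Gaussian density ratio
p_is = np.mean(np.exp(log_w) * (response(Z) > threshold))
```

With the same budget, the IS estimate concentrates its samples where failures actually occur, which is the variance reduction the cited strategies pursue in more sophisticated ways.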
Sensitivity Prewarping for Local Surrogate Modeling
Wycoff, Nathan, Binois, Mickaël, Gramacy, Robert B.
In the continual effort to improve product quality and decrease operations costs, computational modeling is increasingly being deployed to determine feasibility of product designs or configurations. Surrogate modeling of these computer experiments via local models, which induce sparsity by only considering short range interactions, can tackle huge analyses of complicated input-output relationships. However, narrowing focus to local scale means that global trends must be re-learned over and over again. In this article, we propose a framework for incorporating information from a global sensitivity analysis into the surrogate model as an input rotation and rescaling preprocessing step. We discuss the relationship between several sensitivity analysis methods based on kernel regression before describing how they give rise to a transformation of the input variables. Specifically, we perform an input warping such that the "warped simulator" is equally sensitive to all input directions, freeing local models to focus on local dynamics. Numerical experiments on observational data and benchmark test functions, including a high-dimensional computer simulator from the automotive industry, provide empirical validation.
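A minimal sketch of the preprocessing step the abstract describes, using the simplest (gradient outer-product) sensitivity estimate; the paper compares several kernel-regression-based alternatives, and all names below are ours:

```python
import numpy as np

def prewarp(X, grads):
    """Sensitivity prewarping sketch: estimate C = E[grad f grad f^T] from
    gradients (finite differences or a fitted global surrogate), then map
    x -> C^{1/2} x so the warped simulator's average-gradient outer product
    is the identity, i.e. it is equally sensitive in every input direction."""
    C = grads.T @ grads / len(grads)
    evals, evecs = np.linalg.eigh(C)
    evals = np.maximum(evals, 1e-12)          # guard against rank deficiency
    W = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    return X @ W, W                           # warped inputs for local models

X = np.random.rand(200, 3)
grads = np.tile([3.0, 0.5, 0.1], (200, 1))    # toy gradients: x1 dominates
Z, W = prewarp(X, grads)
```

Local models are then fit in the warped coordinates Z, where no single direction dominates the response, so their short-range focus is spent on genuinely local dynamics rather than on re-learning the global trend.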
Sequential design of multi-fidelity computer experiments: maximizing the rate of stepwise uncertainty reduction
Stroh, Rémi, Bect, Julien, Demeyer, Séverine, Fischer, Nicolas, Marquis, Damien, Vazquez, Emmanuel
This article deals with the sequential design of experiments for (deterministic or stochastic) multi-fidelity numerical simulators, that is, simulators that offer control over the accuracy of simulation of the physical phenomenon or system under study. Very often, accurate simulations correspond to high computational efforts whereas coarse simulations can be obtained at a smaller cost. In this setting, simulation results obtained at several levels of fidelity can be combined in order to estimate quantities of interest (the optimal value of the output, the probability that the output exceeds a given threshold...) in an efficient manner. To do so, we propose a new Bayesian sequential strategy called Maximal Rate of Stepwise Uncertainty Reduction (MR-SUR), that selects additional simulations to be performed by maximizing the ratio between the expected reduction of uncertainty and the cost of simulation. This generic strategy unifies several existing methods, and provides a principled approach to develop new ones. We assess its performance on several examples, including a computationally intensive problem of fire safety analysis where the quantity of interest is the probability of exceeding a tenability threshold during a building fire.
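The selection rule itself is compact; a schematic sketch, where both `expected_reduction` (a quantity computed from the multi-fidelity GP posterior, e.g. the reduction in variance of the estimated exceedance probability) and `cost` are assumed callables rather than anything specified in the paper:

```python
# Schematic MR-SUR selection: among candidate (input, fidelity) pairs, run the
# simulation maximizing expected uncertainty reduction per unit cost.
def mr_sur_select(candidates, expected_reduction, cost):
    return max(candidates, key=lambda c: expected_reduction(c) / cost(c))

# e.g. candidates = [(x, level) for x in design for level in (0, 1, 2)]
# next_run = mr_sur_select(candidates, expected_reduction, cost)
```

Normalizing by cost is what lets the strategy spend coarse, cheap runs where they suffice and reserve accurate, expensive runs for where they pay off.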