mlmc
Multilevel neural simulation-based inference
Hikida, Yuga, Bharti, Ayush, Jeffrey, Niall, Briol, François-Xavier
Neural simulation-based inference (SBI) is a popular set of methods for Bayesian inference when models are only available in the form of a simulator. These methods are widely used in the sciences and engineering, where writing down a likelihood can be significantly more challenging than constructing a simulator. However, the performance of neural SBI can suffer when simulators are computationally expensive, thereby limiting the number of simulations that can be performed. In this paper, we propose a novel approach to neural SBI which leverages multilevel Monte Carlo techniques for settings where several simulators of varying cost and fidelity are available. We demonstrate through both theoretical analysis and extensive experiments that our method can significantly enhance the accuracy of SBI methods given a fixed computational budget.
- North America > United States (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Finland (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
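The multilevel idea mentioned in the abstract above can be illustrated with a toy two-level estimator. This is a minimal sketch of the general multilevel Monte Carlo principle for a plain expectation, not the paper's SBI training procedure. The simulators `simulate_low` and `simulate_pair`, their distributions, and the sample sizes are hypothetical stand-ins chosen only for illustration.

```python
import random
import statistics


def simulate_low(rng):
    """Hypothetical cheap, low-fidelity simulator (toy stand-in)."""
    return rng.gauss(0.0, 1.0)


def simulate_pair(rng):
    """Hypothetical coupled run: low- and high-fidelity outputs share the noise z."""
    z = rng.gauss(0.0, 1.0)
    low = z
    high = z + 0.5 + rng.gauss(0.0, 0.1)  # toy "expensive" simulator, mean 0.5
    return low, high


def two_level_estimate(n_low, n_pair, seed=0):
    """Estimate E[high] as E[low] + E[high - low]."""
    rng = random.Random(seed)
    # Level 0: many cheap samples of the low-fidelity output.
    level0 = statistics.fmean(simulate_low(rng) for _ in range(n_low))
    # Level 1: few coupled samples of the correction term. Because the two
    # outputs share noise, the difference has small variance.
    diffs = []
    for _ in range(n_pair):
        low, high = simulate_pair(rng)
        diffs.append(high - low)
    level1 = statistics.fmean(diffs)
    return level0 + level1


if __name__ == "__main__":
    print(two_level_estimate(n_low=20000, n_pair=200))
```

In this toy setting the correction term needs far fewer expensive runs than a direct average of the high-fidelity output would, which is the cost saving that multilevel methods aim for under a fixed budget.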
Nested Expectations with Kernel Quadrature
Chen, Zonghao, Naslidnyk, Masha, Briol, François-Xavier
This paper considers the challenging computational task of estimating nested expectations. Existing algorithms, such as nested Monte Carlo or multilevel Monte Carlo, are known to be consistent but require a large number of samples at both inner and outer levels to converge. Instead, we propose a novel estimator consisting of nested kernel quadrature estimators and we prove that it has a faster convergence rate than all baseline methods when the integrands have sufficient smoothness. We then demonstrate empirically that our proposed method does indeed require fewer samples to estimate nested expectations on real-world applications including Bayesian optimisation, option pricing, and health economics.
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
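For context, the nested Monte Carlo baseline named in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's kernel quadrature estimator: the target is E_X[f(E_Y[g(X, Y) | X])] with the hypothetical choices X ~ Uniform(0, 1), Y ~ Normal(0, 1), g(x, y) = x + y and f(u) = u^2, so the exact value is 1/3.

```python
import random


def nested_mc(n_outer, n_inner, seed=0):
    """Nested Monte Carlo estimate of E_X[ f( E_Y[g(X, Y) | X] ) ]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_outer):
        x = rng.random()  # outer sample X ~ Uniform(0, 1)
        # Inner Monte Carlo average approximating E_Y[g(x, Y)] with g(x, y) = x + y.
        inner = sum(x + rng.gauss(0.0, 1.0) for _ in range(n_inner)) / n_inner
        total += inner ** 2  # f(u) = u^2
    return total / n_outer


if __name__ == "__main__":
    print(nested_mc(n_outer=2000, n_inner=100))
```

Each outer sample requires its own batch of inner samples, so the total cost is `n_outer * n_inner` evaluations of g. Because f is nonlinear, the inner averaging also introduces a bias that shrinks only as `n_inner` grows, which is why many samples are needed at both levels.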
MLMC: Interactive multi-label multi-classifier evaluation without confusion matrices
Doknic, Aleksandar, Möller, Torsten
Machine learning-based classifiers are commonly evaluated by metrics like accuracy, but deeper analysis is required to understand their strengths and weaknesses. MLMC is a visual exploration tool that tackles the challenge of multi-label classifier comparison and evaluation. It offers a scalable alternative to confusion matrices, which are commonly used for such tasks but do not scale well to a large number of classes or labels. Additionally, MLMC allows users to view classifier performance from an instance perspective, a label perspective, and a classifier perspective. Our user study shows that the techniques implemented by MLMC allow for powerful multi-label classifier evaluation while preserving user friendliness.
- Europe > Austria > Vienna (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Bayesian computation with generative diffusion models by Multilevel Monte Carlo
Haji-Ali, Abdul-Lateef, Pereyra, Marcelo, Shaw, Luke, Zygalakis, Konstantinos
Generative diffusion models have recently emerged as a powerful strategy to perform stochastic sampling in Bayesian inverse problems, delivering remarkably accurate solutions for a wide range of challenging applications. However, diffusion models often require a large number of neural function evaluations per sample in order to deliver accurate posterior samples. As a result, using diffusion models as stochastic samplers for Monte Carlo integration in Bayesian computation can be highly computationally expensive. This cost is especially high in large-scale inverse problems such as computational imaging, which rely on large neural networks that are expensive to evaluate. With Bayesian imaging problems in mind, this paper presents a Multilevel Monte Carlo strategy that significantly reduces the cost of Bayesian computation with diffusion models. This is achieved by exploiting cost-accuracy trade-offs inherent to diffusion models to carefully couple models of different levels of accuracy in a manner that significantly reduces the overall cost of the calculation, without reducing the final accuracy. The effectiveness of the proposed Multilevel Monte Carlo approach is demonstrated with three canonical computational imaging problems, where we observe a $4\times$-to-$8\times$ reduction in computational cost compared to conventional Monte Carlo averaging.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain (0.04)
- Europe > United Kingdom (0.04)
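The coupling of models at different accuracy levels described in the abstract above rests on the standard multilevel Monte Carlo telescoping identity. The notation here is an assumption and is not taken from the paper: $P_\ell$ denotes the quantity of interest computed at accuracy level $\ell$, with $\ell = 0$ the cheapest level and $\ell = L$ the most accurate, and $N_\ell$ is the number of samples used at level $\ell$.

$$
\mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L} \mathbb{E}[P_\ell - P_{\ell-1}],
\qquad
\hat{P}^{\mathrm{MLMC}} = \frac{1}{N_0}\sum_{i=1}^{N_0} P_0^{(0,i)} + \sum_{\ell=1}^{L} \frac{1}{N_\ell}\sum_{i=1}^{N_\ell}\left(P_\ell^{(\ell,i)} - P_{\ell-1}^{(\ell,i)}\right).
$$

Within each correction term, $P_\ell^{(\ell,i)}$ and $P_{\ell-1}^{(\ell,i)}$ are computed from the same underlying randomness, so their difference has small variance and few samples are needed at the expensive levels. How the levels are defined and coupled for diffusion models is specific to the paper and is not reproduced here.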
Accelerating Look-ahead in Bayesian Optimization: Multilevel Monte Carlo is All you Need
Yang, Shangda, Zankin, Vitaly, Balandat, Maximilian, Scherer, Stefan, Carlberg, Kevin, Walton, Neil, Law, Kody J. H.
We leverage multilevel Monte Carlo (MLMC) to improve the performance of multi-step look-ahead Bayesian optimization (BO) methods that involve nested expectations and maximizations. The complexity rate of naive Monte Carlo degrades for nested operations, whereas MLMC is capable of achieving the canonical Monte Carlo convergence rate for this type of problem, independently of dimension and without any smoothness assumptions. Our theoretical study focuses on the approximation improvements for one- and two-step look-ahead acquisition functions, but, as we discuss, the approach is generalizable in various ways, including beyond the context of BO. Findings are verified numerically and the benefits of MLMC for BO are illustrated on several benchmark examples. Code is available at https://github.com/Shangda-Yang/MLMCBO.
- Europe > United Kingdom (0.04)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
- Asia > Russia > Siberian Federal District > Novosibirsk Oblast > Novosibirsk (0.04)
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
When stochastic gradient descent (SGD) is applied to sequential simulations such as neural stochastic differential equations, the multilevel Monte Carlo (MLMC) method is known to offer better theoretical computational complexity than the naive Monte Carlo approach. However, in practice, MLMC scales poorly on massively parallel computing platforms such as modern GPUs, because of its large parallel complexity, which is equivalent to that of the naive Monte Carlo method. To cope with this issue, we propose the delayed MLMC gradient estimator, which drastically reduces the parallel complexity of MLMC by recycling previously computed gradient components from earlier steps of SGD. The proposed estimator provably reduces the average parallel complexity per iteration at the cost of a slightly worse per-iteration convergence rate. In our numerical experiments, we use an example of deep hedging to demonstrate the superior parallel complexity of our method compared to standard MLMC in SGD.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Bulgaria (0.04)
Physics-Informed Kriging: A Physics-Informed Gaussian Process Regression Method for Data-Model Convergence
Yang, Xiu, Tartakovsky, Guzel, Tartakovsky, Alexandre
In this work, we propose a new Gaussian process regression (GPR) method: physics-informed Kriging (PhIK). In standard data-driven Kriging, the unknown function of interest is usually treated as a Gaussian process with an assumed stationary covariance whose hyperparameters are estimated from data. In PhIK, we compute the mean and covariance function from realizations of available stochastic models, e.g., from realizations of the solutions of governing stochastic partial differential equations. A Gaussian process constructed in this way is generally non-stationary and does not assume a specific form of the covariance function. Our approach avoids the costly optimization step that data-driven GPR methods need to identify the hyperparameters. More importantly, we prove that physical constraints in the form of a deterministic linear operator are guaranteed to hold in the resulting prediction. We also provide an estimate of the error in preserving the physical constraints when the stochastic model realizations contain errors. To reduce the computational cost of obtaining stochastic model realizations, we propose a multilevel Monte Carlo estimate of the mean and covariance functions. Further, we present an active learning algorithm that guides the selection of additional observation locations. The efficiency and accuracy of PhIK are demonstrated for reconstructing a partially known modified Branin function and learning a conservative tracer distribution from sparse concentration measurements.
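The step of computing a mean and covariance from model realizations can be sketched as an ordinary ensemble estimate. This is a minimal illustration and not the authors' implementation: the function name `ensemble_mean_cov`, the list-of-lists data layout, and the unbiased 1/(M-1) normalisation are assumptions made for this sketch.

```python
def ensemble_mean_cov(realizations):
    """Sample mean and covariance across M realizations at N locations.

    realizations: list of M lists, each holding the model output at the
    same N spatial locations (hypothetical layout for this sketch).
    """
    m = len(realizations)
    n = len(realizations[0])
    # Pointwise mean over the ensemble.
    mean = [sum(r[i] for r in realizations) / m for i in range(n)]
    # Unbiased sample covariance between every pair of locations; no
    # stationary kernel form is imposed.
    cov = [
        [
            sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in realizations) / (m - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov


if __name__ == "__main__":
    mean, cov = ensemble_mean_cov([[1.0, 2.0], [3.0, 6.0]])
    print(mean)  # [2.0, 4.0]
    print(cov)   # [[2.0, 4.0], [4.0, 8.0]]
```

Because the covariance is read directly off the ensemble, there are no kernel hyperparameters to optimise, which matches the motivation given in the abstract. The cost moves into generating the realizations themselves.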