mc approximation
Rényi Divergence Variational Inference
Li, Yingzhen, Turner, Richard E.
University of Cambridge, Cambridge, CB2 1PZ, UK (yl494@cam.ac.uk, ret26@cam.ac.uk)
This paper introduces the variational Rényi bound (VR) that extends traditional variational inference to Rényi's α-divergences. This new family of variational methods unifies a number of existing approaches, and enables a smooth interpolation from the evidence lower-bound to the log (marginal) likelihood that is controlled by the value of α that parametrises the divergence. The reparameterization trick, Monte Carlo approximation and stochastic optimisation methods are deployed to obtain a tractable and unified framework for optimisation. We further consider negative α values and propose a novel variational inference method as a new special case in the proposed framework. Experiments on Bayesian neural networks and variational auto-encoders demonstrate the wide applicability of the VR bound.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (1.00)
- Asia > Middle East > Jordan (0.05)
- North America > United States > New York (0.04)
- (2 more...)
Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks
Bhouri, Mohamed Aziz, Joly, Michael, Yu, Robert, Sarkar, Soumalya, Perdikaris, Paris
Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number of objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augment the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained multi-fidelity optimization task involving shape optimization of rotor blades in turbo-machinery.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- (4 more...)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
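As a rough illustration of a re-parameterized Monte Carlo approximation of a multiple-point acquisition function, the sketch below estimates parallel expected improvement for a maximisation problem. It assumes the surrogate's joint prediction at the candidate points is Gaussian, which is a simplification and not the ensemble-based surrogate described in the abstract. The function names `mc_parallel_ei` and `analytic_ei` are hypothetical and chosen only for this example.

```python
import math
import numpy as np

def mc_parallel_ei(mean, cov, best, n_samples=200_000, seed=0):
    """Monte Carlo estimate of multi-point expected improvement (maximisation).

    Assumption: the joint prediction at the q candidate points is Gaussian
    with the given mean (q,) and covariance (q, q).
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    # Reparameterisation: f = mean + L z with z ~ N(0, I), so the noise z
    # does not depend on the surrogate's parameters.
    z = rng.standard_normal((n_samples, mean.size))
    f = mean + z @ chol.T
    # Improvement of the best point in the batch over the incumbent value.
    improvement = np.maximum(f.max(axis=1) - best, 0.0)
    return float(improvement.mean())

def analytic_ei(mean, std, best):
    """Closed-form single-point expected improvement, used as a check."""
    u = (mean - best) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return std * (u * cdf + pdf)

# One candidate: the MC estimate should be close to the closed form.
single = mc_parallel_ei([0.5], [[1.0]], best=0.0)
# Two independent candidates: the batch can only improve on a single point.
pair = mc_parallel_ei([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], best=0.0)
```

For a single candidate the estimate can be checked against `analytic_ei(0.5, 1.0, 0.0)`; for batches of two or more points no simple closed form is generally available, which is where a Monte Carlo estimate of this kind becomes useful.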
Binary classification based Monte Carlo simulation
Argouarc'h, Elouan, Desbouvries, François
Acceptance-rejection (AR), Independent Metropolis-Hastings (IMH), and importance sampling (IS) Monte Carlo (MC) simulation algorithms all involve computing ratios of probability density functions (pdfs). On the other hand, classifiers discriminate labeled samples produced by a mixture of two distributions and can be used for approximating the ratio of the two corresponding pdfs. This bridge between simulation and classification enables us to propose pdf-free versions of pdf-ratio-based simulation algorithms, where the ratio is replaced by a surrogate function computed via a classifier. From a probabilistic modeling perspective, our procedure involves a structured energy-based model which can easily be trained and is compatible with the classical samplers.
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Hudson County > Secaucus (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
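The link between classification and pdf ratios can be sketched on a toy problem. With equal numbers of samples from p and q, the logit of an ideal classifier equals log p(x)/q(x). The example below is an illustration under assumed settings, not the authors' structured model: it fits a plain logistic regression (hypothetical helper `fit_logistic`) to samples from p = N(1, 1) and q = N(0, 1), for which the true log ratio is x - 0.5, and then uses the learned ratio as a self-normalised importance weight.

```python
import numpy as np

def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1 | x) = w0 + w1 * x by Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        w = w + np.linalg.solve(hess, grad)
    return w

rng = np.random.default_rng(0)
n = 20_000
x_p = rng.normal(1.0, 1.0, n)   # samples from p, labelled 1
x_q = rng.normal(0.0, 1.0, n)   # samples from q, labelled 0
x = np.concatenate([x_p, x_q])
y = np.concatenate([np.ones(n), np.zeros(n)])

# With balanced classes the fitted logit approximates log p(x) / q(x).
w = fit_logistic(x, y)

# pdf-free importance sampling: estimate E_p[x] from fresh q samples,
# using the classifier's ratio in place of p(x) / q(x).
x_new = rng.normal(0.0, 1.0, n)
weights = np.exp(w[0] + w[1] * x_new)
is_mean = float(np.sum(weights * x_new) / np.sum(weights))
```

Here the fitted coefficients should land near (-0.5, 1) and `is_mean` near the true mean of p, which is 1. A linear logit suffices only because both densities are Gaussian with equal variance; other pairs of distributions would need a more flexible classifier.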
Discretely Indexed Flows
Argouarc'h, Elouan, Desbouvries, François, Barat, Eric, Kawasaki, Eiji, Dautremer, Thomas
In this paper we propose Discretely Indexed Flows (DIF) as a new tool for solving variational estimation problems. Roughly speaking, DIF are built as an extension of Normalizing Flows (NF), in which the deterministic transport becomes stochastic, and more precisely discretely indexed. Due to the discrete nature of the underlying additional latent variable, DIF inherit the good computational behavior of NF: they benefit from both a tractable density and a straightforward sampling scheme, and can thus be used for the dual problems of Variational Inference (VI) and of Variational density estimation (VDE). On the other hand, DIF can also be understood as an extension of mixture density models, in which the constant mixture weights are replaced by flexible functions. As a consequence, DIF are better suited for capturing distributions with discontinuities, sharp edges and fine details, which is a main advantage of this construction. Finally, we propose a methodology for constructing DIF in practice, and show that DIF can be sequentially cascaded, and cascaded with NF.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- (2 more...)
A Class of Two-Timescale Stochastic EM Algorithms for Nonconvex Latent Variable Models
The Expectation-Maximization (EM) algorithm is a popular choice for learning latent variable models. Variants of EM were initially introduced by Neal and Hinton (1998), using incremental updates to scale to large datasets, and by Wei and Tanner (1990) and Delyon et al. (1999), using Monte Carlo (MC) approximations to bypass the intractable conditional expectation of the latent data for most nonconvex models. In this paper, we propose a general class of methods called Two-Timescale EM Methods based on a two-stage approach of stochastic updates to tackle an essential nonconvex optimization task for latent variable models. We motivate the choice of a double dynamic by invoking the variance reduction that each stage of the method provides against the two sources of noise: the index sampling for the incremental update and the MC approximation. We establish finite-time and global convergence bounds for nonconvex objective functions. Numerical applications on various models, such as deformable templates for image analysis and nonlinear models for pharmacokinetics, are also presented to illustrate our findings.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Washington > King County > Bellevue (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- (5 more...)
Handling missing data in model-based clustering
Serafini, Alessio, Murphy, Thomas Brendan, Scrucca, Luca
Gaussian mixture models (GMMs) are a powerful tool for clustering, classification and density estimation when clustering structures are embedded in the data. The presence of missing values can strongly affect the GMM estimation process, so handling missing data is a crucial point in clustering, classification and density estimation. Several techniques have been developed to impute the missing values before model estimation. Among these, multiple imputation is a simple and useful general approach to handle missing data. In this paper we propose two different methods to fit Gaussian mixtures in the presence of missing data. Both methods use a variant of the Monte Carlo Expectation-Maximisation (MCEM) algorithm for data augmentation. Thus, multiple imputations are performed during the E-step, followed by the standard M-step for a given eigen-decomposed component-covariance matrix. We show that the proposed methods outperform the multiple imputation approach, both in terms of cluster identification and density estimation.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.05)
- North America > United States > New York (0.04)
- (3 more...)
- Information Technology > Data Science > Data Quality (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.64)
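One building block of a Monte Carlo E-step with imputation can be sketched as follows: given a single Gaussian component, missing coordinates are drawn from their conditional distribution given the observed ones. This is an assumed, simplified illustration and not the authors' algorithm; the helper `impute_missing` is hypothetical, missing entries are encoded as NaN, and at least one coordinate is assumed to be observed.

```python
import numpy as np

def impute_missing(x, mean, cov, n_draws, rng):
    """Draw n_draws completions of x (NaN = missing) under N(mean, cov)."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    draws = np.tile(x, (n_draws, 1))
    if not miss.any():
        return draws
    s_oo = cov[np.ix_(obs, obs)]
    s_mo = cov[np.ix_(miss, obs)]
    s_mm = cov[np.ix_(miss, miss)]
    # Regression coefficients of the missing block on the observed block.
    gain = np.linalg.solve(s_oo, s_mo.T).T
    cond_mean = mean[miss] + gain @ (x[obs] - mean[obs])
    cond_cov = s_mm - gain @ s_mo.T
    # Fill the missing coordinates with draws from the conditional Gaussian.
    draws[:, miss] = rng.multivariate_normal(cond_mean, cond_cov, size=n_draws)
    return draws

rng = np.random.default_rng(0)
mean = [0.0, 1.0]
cov = [[1.0, 0.8], [0.8, 2.0]]
# Second coordinate is missing; conditional law is N(1.8, 1.36).
draws = impute_missing([1.0, np.nan], mean, cov, n_draws=20_000, rng=rng)
```

In a full MCEM scheme for a mixture, draws like these would be generated per component and weighted by the component responsibilities before the M-step; that part is omitted here.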
Variational Inference with Vine Copulas: An efficient Approach for Bayesian Computer Model Calibration
Kejzlar, Vojtech, Maiti, Tapabrata
The ever-growing access to high performance computing in scientific communities has enabled development of complex computer models in fields such as nuclear physics, climatology, and engineering that produce massive amounts of data. These models need real-time calibration with quantified uncertainties. Bayesian methodology combined with Gaussian process modeling has been heavily utilized for calibration of computer models due to its natural way to account for various sources of uncertainty; see Higdon et al. (2015), and King et al. (2019) for examples in nuclear physics, Sexton et al. (2012) and Pollard et al. (2016) for examples in climatology, and Lawrence et al. (2010), Plumlee et al. (2016) and Zhang et al. (2019) for applications in engineering, astrophysics, and medicine. The original framework for Bayesian calibration of computer models was developed by Kennedy and O'Hagan (2001) with extensions provided by Higdon et al. (2005, 2008); Bayarri et al. (2007); Plumlee (2017, 2019), and Gu and Wang (2018), to name a few. Despite its popularity, however, Bayesian calibration becomes infeasible in big-data scenarios with complex and many-parameter models because it relies on Markov chain Monte Carlo (MCMC) algorithms to approximate posterior densities. This text presents a scalable and statistically principled approach to Bayesian calibration of computer models. We offer an alternative approximation to posterior densities using variational Bayesian inference (VBI), which originated as a machine learning algorithm that approximates a target density through optimization. Statisticians and computer scientists (starting with Peterson and Anderson (1987); Jordan et al. (1999)) have been widely using variational techniques because they tend to be faster and easier to scale to massive datasets. Moreover, the recently published frequentist consistency of variational Bayes by Wang and Blei (2018) established VBI as a theoretically valid procedure.
- Asia > Middle East > Jordan (0.24)
- North America > United States > Virginia > Arlington County > Arlington (0.14)
- North America > United States > Michigan (0.04)
- (3 more...)
Rényi Divergence Variational Inference
Li, Yingzhen, Turner, Richard E.
This paper introduces the variational Rényi bound (VR) that extends traditional variational inference to Rényi's alpha-divergences. This new family of variational methods unifies a number of existing approaches, and enables a smooth interpolation from the evidence lower-bound to the log (marginal) likelihood that is controlled by the value of alpha that parametrises the divergence. The reparameterization trick, Monte Carlo approximation and stochastic optimisation methods are deployed to obtain a tractable and unified framework for optimisation. We further consider negative alpha values and propose a novel variational inference method as a new special case in the proposed framework. Experiments on Bayesian neural networks and variational auto-encoders demonstrate the wide applicability of the VR bound.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > Middle East > Jordan (0.05)
- North America > United States > New York (0.04)
- (2 more...)
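The interpolation controlled by alpha can be illustrated with a small Monte Carlo estimator. The sketch below uses a common form of the estimate, (1 / (1 - alpha)) * log of the average of w_k^(1 - alpha), with importance weights w_k = p(theta_k, D) / q(theta_k) and theta_k drawn from q. The toy target, the choice of q, and the function names are assumptions made for this example: the unnormalised target is Z * N(0, 1) with Z = 3, so the log marginal likelihood is log 3, and q = N(0, 1.5^2).

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    return -0.5 * np.log(2.0 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def vr_bound_mc(log_joint, log_q, samples, alpha):
    """Monte Carlo estimate of the variational Renyi bound from q-samples."""
    log_w = log_joint(samples) - log_q(samples)
    if np.isclose(alpha, 1.0):
        # Limit alpha -> 1: the usual evidence lower bound.
        return float(np.mean(log_w))
    scaled = (1.0 - alpha) * log_w
    m = np.max(scaled)  # log-sum-exp shift for numerical stability
    log_mean = m + np.log(np.mean(np.exp(scaled - m)))
    return float(log_mean / (1.0 - alpha))

log_z = np.log(3.0)
q_std = 1.5
log_joint = lambda t: log_z + log_normal_pdf(t, 0.0, 1.0)
log_q = lambda t: log_normal_pdf(t, 0.0, q_std)

rng = np.random.default_rng(0)
theta = rng.normal(0.0, q_std, size=50_000)

bound_0 = vr_bound_mc(log_joint, log_q, theta, alpha=0.0)
bound_half = vr_bound_mc(log_joint, log_q, theta, alpha=0.5)
elbo = vr_bound_mc(log_joint, log_q, theta, alpha=1.0)

# Closed-form KL(q || p) for the two zero-mean Gaussians, used as a check.
kl_q_p = 0.5 * (q_std**2 - 1.0 - np.log(q_std**2))
```

On this toy problem the alpha = 0 estimate should sit near log 3, the alpha = 1 estimate near log 3 minus `kl_q_p`, and alpha = 0.5 in between, which mirrors the interpolation described in the abstract.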