gp approximation
Fast Gaussian Process Approximations for Autocorrelated Data
Chokhachian, Ahmadreza, Katzfuss, Matthias, Ding, Yu
This paper is concerned with the problem of how to speed up computation for Gaussian process models trained on autocorrelated data. The Gaussian process model is a powerful tool commonly used in nonlinear regression applications. Standard regression modeling assumes random samples and independent, identically distributed noise. Various fast approximations that speed up Gaussian process regression work under this standard setting. But for autocorrelated data, failing to account for autocorrelation leads to a phenomenon known as temporal overfitting, which degrades model performance on new test instances. To handle autocorrelated data, existing fast Gaussian process approximations have to be modified; one such approach is to segment the originally correlated data points into blocks in which the blocked data are de-correlated. This work explains how to make some of the existing Gaussian process approximations work with blocked data. Numerical experiments across diverse application datasets demonstrate that the proposed approaches can remarkably accelerate computation for Gaussian process regression on autocorrelated data without compromising model prediction performance.
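The segmentation step described in this abstract can be illustrated with a minimal sketch. It assumes the observations are already in time order and uses equal-length contiguous blocks; the helper name `contiguous_blocks` and the block length are hypothetical choices, not taken from the paper, and the sketch does not show how the blocks are subsequently de-correlated.

```python
import numpy as np

def contiguous_blocks(n_samples, block_size):
    """Split indices 0..n_samples-1 into consecutive blocks (hypothetical helper)."""
    if n_samples < 0 or block_size <= 0:
        raise ValueError("n_samples must be >= 0 and block_size must be > 0")
    # The last block may be shorter when block_size does not divide n_samples.
    return [np.arange(start, min(start + block_size, n_samples))
            for start in range(0, n_samples, block_size)]

# Example: hold out one whole block so that test points are not
# interleaved in time with training points.
blocks = contiguous_blocks(n_samples=10, block_size=4)
test_idx = blocks[1]
train_idx = np.concatenate([b for j, b in enumerate(blocks) if j != 1])
```

Holding out whole blocks rather than random points is one common way to keep temporally adjacent observations from appearing on both sides of a train/test split.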
Export Reviews, Discussions, Author Feedback and Meta-Reviews
The main idea builds upon the inducing-point formalism underpinning most sparse methods for GP inference. As the computational cost of traditional sparse methods in GPs based on inducing points is O(NM^2), where N is the number of observations and M is the number of inducing points, the paper addresses the problem of large-scale inference by making conditional independence assumptions across inducing points. More specifically, the method proposed in the paper can be seen as a modified version of the partially independent conditional (PIC) approach, where not only the latent functions are grouped in blocks but also the inducing points are clustered in blocks (corresponding to those latent functions) and statistical dependences across inducing point blocks are modeled with a tree. These additional independence assumptions make the resulting inference algorithm much more scalable as it only scales (potentially) linearly with the number of observations and the number of inducing points. The method is evaluated on 1D and 2D problems showing that it outperforms standard sparse GP approximations.
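To make the quoted cost concrete, here is a minimal sketch of a standard inducing-point predictor (the subset-of-regressors form), not the tree-structured method under review. The kernel choice, the function names `rbf` and `sor_predict_mean`, and the default hyperparameters are assumptions made for illustration. The dominant cost is forming the M x M matrix from the M x N cross-covariance, which takes on the order of N M^2 operations.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between inputs a (n, d) and b (m, d)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def sor_predict_mean(x, y, z, x_new, noise_var=0.1, lengthscale=1.0):
    """Subset-of-regressors predictive mean with inducing inputs z (M, d)."""
    k_uu = rbf(z, z, lengthscale)            # (M, M)
    k_uf = rbf(z, x, lengthscale)            # (M, N)
    k_su = rbf(x_new, z, lengthscale)        # (N_new, M)
    # Forming k_uf @ k_uf.T is the O(N M^2) step.
    a = noise_var * k_uu + k_uf @ k_uf.T
    a += 1e-8 * np.eye(len(z))               # small jitter for stability
    return k_su @ np.linalg.solve(a, k_uf @ y)
```

When the inducing inputs coincide with the training inputs, this predictor reduces to the exact GP posterior mean, which gives a simple sanity check.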
Posterior and Computational Uncertainty in Gaussian Processes
Wenger, Jonathan, Pleiss, Geoff, Pförtner, Marvin, Hennig, Philipp, Cunningham, John P.
Gaussian processes scale prohibitively with the size of the dataset. In response, many approximation methods have been developed, which inevitably introduce approximation error. This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior. Therefore in practice, GP models are often as much about the approximation method as they are about the data. Here, we develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended. The most common GP approximations map to an instance in this class, such as methods based on the Cholesky factorization, conjugate gradients, and inducing points. For any method in this class, we prove (i) convergence of its posterior mean in the associated RKHS, (ii) decomposability of its combined posterior covariance into mathematical and computational covariances, and (iii) that the combined variance is a tight worst-case bound for the squared error between the method's posterior mean and the latent function. Finally, we empirically demonstrate the consequences of ignoring computational uncertainty and show how implicitly modeling it improves generalization performance on benchmark datasets.
Scaled Vecchia approximation for fast computer-model emulation
Katzfuss, Matthias, Guinness, Joseph, Lawrence, Earl
Many scientific phenomena are studied using computer experiments consisting of multiple runs of a computer model while varying the input settings. Gaussian processes (GPs) are a popular tool for the analysis of computer experiments, enabling interpolation between input settings, but direct GP inference is computationally infeasible for large datasets. We adapt and extend a powerful class of GP methods from spatial statistics to enable the scalable analysis and emulation of large computer experiments. Specifically, we apply Vecchia's ordered conditional approximation in a transformed input space, with each input scaled according to how strongly it relates to the computer-model response. The scaling is learned from the data, by estimating parameters in the GP covariance function using Fisher scoring. Our methods are highly scalable, enabling estimation, joint prediction and simulation in near-linear time in the number of model runs. In several numerical examples, our approach substantially outperformed existing methods.
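The core idea of an ordered conditional approximation in a scaled input space can be sketched as follows. This is an illustrative simplification under stated assumptions: it uses a unit-variance squared-exponential kernel, takes the input ordering as given, finds neighbours by brute force, and treats the per-input `scales` as fixed rather than estimating them by Fisher scoring. The function names are hypothetical.

```python
import numpy as np

def sq_exp(a, b):
    """Unit-variance squared-exponential kernel on already-scaled inputs."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2)

def vecchia_loglik(x, y, scales, m=10, noise_var=1e-2):
    """Sum of log p(y_i | y_c(i)), with c(i) the m nearest earlier points."""
    xs = x / scales                      # scale each input dimension
    total = 0.0
    for i in range(len(y)):
        # Nearest neighbours among points that come earlier in the ordering.
        d2 = ((xs[:i] - xs[i]) ** 2).sum(axis=1)
        nbrs = np.argsort(d2)[:m]
        mean, var = 0.0, 1.0 + noise_var
        if nbrs.size > 0:
            k_cc = sq_exp(xs[nbrs], xs[nbrs]) + noise_var * np.eye(nbrs.size)
            k_ic = sq_exp(xs[i:i + 1], xs[nbrs])[0]
            w = np.linalg.solve(k_cc, k_ic)
            mean = w @ y[nbrs]           # conditional mean of y_i
            var = var - w @ k_ic         # conditional variance of y_i
        total += -0.5 * (np.log(2.0 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total
```

With `m` at least `len(y) - 1`, every point conditions on all earlier points and the sum equals the exact Gaussian log-likelihood; smaller `m` trades accuracy for speed.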
Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions
Karvonen, Toni, Wynne, George, Tronarp, Filip, Oates, Chris J., Särkkä, Simo
Despite the ubiquity of the Gaussian process regression model, few theoretical results are available that account for the fact that parameters of the covariance kernel typically need to be estimated from the dataset. This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset. Specifically, we consider the scenario where the scale parameter of a Sobolev kernel (such as a Matérn kernel) is estimated by maximum likelihood. We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model in the sense that the model can become "slowly" overconfident at worst, regardless of the difference between the smoothness of the data-generating function and that expected by the model. The analysis is based on a combination of techniques from nonparametric regression and scattered data interpolation. Empirical results are provided in support of the theoretical findings.
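For a zero-mean GP with covariance sigma^2 K and noiseless data y of length n, maximising the Gaussian log-likelihood over sigma^2 gives the closed form sigma^2 = y^T K^{-1} y / n. The sketch below illustrates this under assumptions of my own: one-dimensional inputs, a Matérn kernel with smoothness 3/2, a fixed lengthscale, and hypothetical function names.

```python
import numpy as np

def matern32(a, b, lengthscale=1.0):
    """Matérn kernel with smoothness 3/2 for 1-D inputs a (n,) and b (m,)."""
    r = np.abs(a[:, None] - b[None, :]) / lengthscale
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def scale_mle(x, y, lengthscale=1.0):
    """Closed-form maximum likelihood estimate of the scale parameter."""
    k = matern32(x, x, lengthscale)
    return float(y @ np.linalg.solve(k, y)) / len(y)

def log_lik(x, y, scale_sq, lengthscale=1.0):
    """Log-likelihood of noiseless data under a GP with covariance scale_sq * K."""
    k = matern32(x, x, lengthscale)
    n = len(y)
    _, logdet = np.linalg.slogdet(k)
    quad = float(y @ np.linalg.solve(k, y)) / scale_sq
    return -0.5 * (quad + n * np.log(scale_sq) + logdet + n * np.log(2.0 * np.pi))
```

Because the estimate is quadratic in the data, rescaling `y` by a constant `c` rescales it by `c**2`, so the predictive variances track the overall magnitude of the observed function.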
Scalable Variational Gaussian Processes for Crowdsourcing: Glitch Detection in LIGO
Morales-Álvarez, Pablo, Ruiz, Pablo, Coughlin, Scott, Molina, Rafael, Katsaggelos, Aggelos K.
In recent years, crowdsourcing has been transforming the way classification training sets are obtained. Instead of relying on a single expert annotator, crowdsourcing shares the labelling effort among a large number of collaborators. For instance, this is being applied to the data acquired by the laureate Laser Interferometer Gravitational-Wave Observatory (LIGO), in order to detect glitches which might hinder the identification of true gravitational waves. The crowdsourcing scenario poses new challenges, as it deals with different opinions from a heterogeneous group of annotators with unknown degrees of expertise. Probabilistic methods, such as Gaussian Processes (GP), have proven successful in modeling this setting. However, GPs do not scale well to large data sets, which hampers their broad adoption in real practice (in particular at LIGO). This has led to the recent introduction of deep learning based crowdsourcing methods, which have become the state of the art. However, the accurate uncertainty quantification of GPs has been partially sacrificed. This is an important aspect for astrophysicists in LIGO, since a glitch detection system should provide very accurate probability distributions of its predictions. In this work, we leverage the most popular sparse GP approximation to develop a novel GP based crowdsourcing method that factorizes into mini-batches. This makes it able to cope with previously prohibitive data sets. The approach, which we refer to as Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR), brings GP-based methods back to the state of the art, and excels at uncertainty quantification. SVGPCR is shown to outperform deep learning based methods and previous probabilistic approaches when applied to the LIGO data. Moreover, its behavior and main properties are carefully analyzed in a controlled experiment based on the MNIST data set.
Merging MCMC Subposteriors through Gaussian-Process Approximations
Nemeth, Christopher, Sherlock, Chris
Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate workers. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly a Hamiltonian Monte Carlo algorithm targeting the expectation of the posterior density provides a sample from an approximation to the posterior; secondly, evaluating the true posterior at the sampled points leads to an importance sampler that, asymptotically, targets the true posterior expectations; finally, an alternative importance sampler uses the full Gaussian-process distribution of the approximation to the log-posterior density to re-weight any initial sample and provide both an estimate of the posterior expectation and a measure of the uncertainty in it.
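The recombination idea rests on the fact that the full log-posterior is, up to a constant, the sum of the log-subposteriors. A minimal sketch of the surrogate step follows, under simplifying assumptions that are mine rather than the paper's: a scalar parameter, a shared grid of design points instead of per-batch MCMC samples, a zero-mean GP with a squared-exponential kernel, and only the GP posterior mean (the GP uncertainty, the Hamiltonian Monte Carlo step and the importance samplers are not shown). All function names are hypothetical.

```python
import numpy as np

def rbf(a, b, lengthscale):
    """Squared-exponential kernel for 1-D inputs a (n,) and b (m,)."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_mean(theta_design, values, theta_new, lengthscale=0.5, jitter=1e-8):
    """Posterior mean of a zero-mean GP fitted to `values` at `theta_design`."""
    k = rbf(theta_design, theta_design, lengthscale)
    k = k + jitter * np.eye(len(theta_design))
    return rbf(theta_new, theta_design, lengthscale) @ np.linalg.solve(k, values)

def merged_log_posterior(theta_design, log_subposteriors, theta_new, lengthscale=0.5):
    """Approximate the full log-posterior by summing the surrogate means."""
    return sum(gp_mean(theta_design, v, theta_new, lengthscale)
               for v in log_subposteriors)

# Toy example: three unit-variance Gaussian subposteriors (unnormalised logs).
theta_design = np.linspace(-3.0, 3.0, 13)
centres = [-1.0, 0.5, 2.0]
log_subs = [-0.5 * (theta_design - c) ** 2 for c in centres]
merged = merged_log_posterior(theta_design, log_subs, theta_design)
theta_mode = theta_design[np.argmax(merged)]
```

In this toy case the product of the three Gaussians is itself Gaussian and centred at the average of the three centres, which gives a reference value for `theta_mode`.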