Implementation and Analysis of GPU Algorithms for Vecchia Approximation
James, Zachary, Guinness, Joseph
Gaussian processes have become an indispensable part of the spatial statistician's toolbox but are unsuitable for analyzing large datasets because of the significant time and memory needed to fit the associated model exactly. Vecchia approximation is widely used to reduce the computational complexity and can be calculated with embarrassingly parallel algorithms. While multi-core software has been developed for Vecchia approximation, such as the GpGp R package, software designed to run on graphics processing units (GPUs) is lacking, despite the tremendous success GPUs have had in statistics and machine learning. We compare three different ways to implement Vecchia approximation on a GPU: two that are similar to methods used for other Gaussian process approximations and one that is new. The impact of memory type on performance is investigated and the final method is optimized accordingly. We show that our new method outperforms the other two and present it in the GpGpU R package. We compare GpGpU to existing multi-core and GPU-accelerated software by fitting Gaussian process models on various datasets, including a large spatial-temporal dataset of $n>10^6$ points collected from an earth-observing satellite. Our results show that GpGpU achieves faster runtimes and better predictive accuracy.
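To make the shared machinery concrete: the Vecchia approximation replaces the joint Gaussian density with a product of low-dimensional conditionals, one per observation, which is what makes the likelihood embarrassingly parallel. Below is a minimal NumPy sketch of that objective, assuming an exponential covariance; the helper names are illustrative and this is not the GpGpU implementation, which parallelizes the same per-observation terms on a GPU.

```python
# Minimal sketch of the Vecchia log-likelihood (illustrative helper names,
# not the GpGpU code). The joint density is replaced by a product of
# conditionals, each depending on at most m previously ordered neighbors,
# so every term of the sum below can be computed independently.
import numpy as np

def exponential_cov(locs_a, locs_b, variance=1.0, range_=0.5):
    # Isotropic exponential covariance; a stand-in for GpGp's Matern family.
    d = np.linalg.norm(locs_a[:, None, :] - locs_b[None, :, :], axis=-1)
    return variance * np.exp(-d / range_)

def vecchia_loglik(y, locs, m=10):
    n = len(y)
    ll = 0.0
    for i in range(n):
        if i == 0:
            v = exponential_cov(locs[:1], locs[:1])[0, 0]
            ll += -0.5 * (np.log(2 * np.pi * v) + y[0] ** 2 / v)
            continue
        # conditioning set: up to m nearest previously ordered points
        d = np.linalg.norm(locs[:i] - locs[i], axis=1)
        nn = np.argsort(d)[:m]
        K_nn = exponential_cov(locs[nn], locs[nn])
        k_in = exponential_cov(locs[i:i + 1], locs[nn])[0]
        w = np.linalg.solve(K_nn, k_in)
        mu = w @ y[nn]                                    # conditional mean
        var = exponential_cov(locs[i:i + 1], locs[i:i + 1])[0, 0] - w @ k_in
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll
```

Each term solves only an $m \times m$ linear system, so the total cost is $O(nm^3)$ rather than $O(n^3)$, and the terms are mutually independent, which is exactly the structure a GPU implementation exploits.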
Scalable Model-Based Gaussian Process Clustering
Chakraborty, Anirban, Chakraborty, Abhisek
The Gaussian process is an indispensable tool for clustering functional data, owing to its flexibility and inherent uncertainty quantification. However, when the functional data are observed over a large grid (say, of length $p$), Gaussian process clustering quickly becomes infeasible, incurring $O(p^2)$ space complexity and $O(p^3)$ time complexity per iteration, thus prohibiting its natural adaptation to large environmental applications. To ensure the scalability of Gaussian process clustering in such applications, we propose to embed the popular Vecchia approximation for Gaussian processes at the heart of the clustering task, provide crucial theoretical insights towards algorithmic design, and finally develop a computationally efficient expectation maximization (EM) algorithm. Empirical evidence of the utility of our proposal is provided via simulations and analysis of polar temperature anomaly (\href{https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series}{noaa.gov}) datasets.
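As a rough illustration of how the Vecchia approximation slots into the E-step, the sketch below evaluates cluster responsibilities with the `vecchia_loglik` function from the first sketch; all names and the parameterization are hypothetical, and the authors' M-step updates for cluster means and covariance parameters are omitted.

```python
# Hedged sketch of one EM iteration for a K-component GP mixture in which
# each cluster density is evaluated with vecchia_loglik from the sketch
# above; names are illustrative, not the authors' algorithm verbatim.
import numpy as np
from scipy.special import logsumexp

def em_step(curves, grid, log_pi, cluster_means, m=10):
    # curves: list of length-p observation vectors; grid: (p, 1) array.
    n, K = len(curves), len(log_pi)
    log_r = np.empty((n, K))
    for i, y in enumerate(curves):
        for k in range(K):
            # O(p m^3) per evaluation instead of the exact O(p^3) GP density
            log_r[i, k] = log_pi[k] + vecchia_loglik(y - cluster_means[k], grid, m=m)
    log_r -= logsumexp(log_r, axis=1, keepdims=True)   # E-step: responsibilities
    r = np.exp(log_r)
    new_log_pi = np.log(r.sum(axis=0)) - np.log(n)     # M-step: mixing weights
    return r, new_log_pi
```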
Scalable Gaussian-process regression and variable selection using Vecchia approximations
Cao, Jian, Guinness, Joseph, Genton, Marc G., Katzfuss, Matthias
Gaussian process (GP) regression is a flexible, nonparametric approach to regression that naturally quantifies uncertainty. In many applications, the numbers of responses and covariates are both large, and a goal is to select covariates that are related to the response. For this setting, we propose a novel, scalable algorithm, coined VGPR, which optimizes a penalized GP log-likelihood based on the Vecchia GP approximation, an ordered conditional approximation from spatial statistics that implies a sparse Cholesky factor of the precision matrix. We traverse the regularization path from strong to weak penalization, sequentially adding candidate covariates based on the gradient of the log-likelihood and deselecting irrelevant covariates via a new quadratic constrained coordinate descent algorithm. We propose Vecchia-based mini-batch subsampling, which provides unbiased gradient estimators. The resulting procedure is scalable to millions of responses and thousands of covariates. Theoretical analysis and numerical studies demonstrate the improved scalability and accuracy relative to existing methods.
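The mini-batch idea rests on the fact that the Vecchia log-likelihood is a sum of independent conditional terms, so a uniformly subsampled, rescaled sum of per-term gradients is unbiased for the full gradient. A minimal sketch of that estimator, assuming a user-supplied `grad_term(i, theta)` returning the gradient of the $i$-th conditional term (e.g., differentiated from the first sketch):

```python
# Sketch of Vecchia-based mini-batch subsampling. Because the Vecchia
# objective is a sum of independent conditional terms, a uniform subsample
# rescaled by n/b is an unbiased estimator of the full gradient.
import numpy as np

def minibatch_gradient(grad_term, n, theta, batch_size=256, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    batch = rng.choice(n, size=batch_size, replace=False)
    g = sum(grad_term(i, theta) for i in batch)
    return (n / batch_size) * g   # each term appears w.p. b/n, so this is unbiased
```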
Scalable Bayesian Optimization Using Vecchia Approximations of Gaussian Processes
Jimenez, Felix, Katzfuss, Matthias
Bayesian optimization is a technique for optimizing black-box target functions. At the core of Bayesian optimization is a surrogate model that predicts the output of the target function at previously unseen inputs to facilitate the selection of promising input values. Gaussian processes (GPs) are commonly used as surrogate models but are known to scale poorly with the number of observations. We adapt the Vecchia approximation, a popular GP approximation from spatial statistics, to enable scalable high-dimensional Bayesian optimization. We develop several improvements and extensions, including training warped GPs using mini-batch gradient descent, approximate neighbor search, and selecting multiple input values in parallel. We focus on the use of our warped Vecchia GP in trust-region Bayesian optimization via Thompson sampling. On several test functions and on two reinforcement-learning problems, our methods compared favorably to the state of the art.
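A hedged sketch of the trust-region Thompson-sampling step described in this abstract: candidates are drawn in a box around the incumbent, a single posterior sample is taken, and its minimizer becomes the next evaluation. For brevity the sketch uses an exact GP posterior with the exponential kernel from the first sketch rather than the paper's warped Vecchia GP surrogate.

```python
# Hedged sketch of trust-region Thompson sampling: draw candidates in a box
# around the incumbent, take one posterior sample, move to its minimizer.
# Exact GP posterior used for brevity; the paper's surrogate is a warped
# Vecchia GP.
import numpy as np

def thompson_step(X, y, lower, upper, best_x, radius=0.1, n_cand=512, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    cand = best_x + radius * rng.uniform(-1.0, 1.0, size=(n_cand, X.shape[1]))
    cand = np.clip(cand, lower, upper)           # stay inside the search box
    K = exponential_cov(X, X) + 1e-6 * np.eye(len(X))
    Ks = exponential_cov(cand, X)
    mu = Ks @ np.linalg.solve(K, y)
    cov = exponential_cov(cand, cand) - Ks @ np.linalg.solve(K, Ks.T)
    cov += 1e-8 * np.eye(n_cand)                 # jitter for numerical stability
    sample = rng.multivariate_normal(mu, cov)    # one draw from the posterior
    return cand[np.argmin(sample)]               # next point to evaluate
```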
Scaled Vecchia approximation for fast computer-model emulation
Katzfuss, Matthias, Guinness, Joseph, Lawrence, Earl
Many scientific phenomena are studied using computer experiments consisting of multiple runs of a computer model while varying the input settings. Gaussian processes (GPs) are a popular tool for the analysis of computer experiments, enabling interpolation between input settings, but direct GP inference is computationally infeasible for large datasets. We adapt and extend a powerful class of GP methods from spatial statistics to enable the scalable analysis and emulation of large computer experiments. Specifically, we apply Vecchia's ordered conditional approximation in a transformed input space, with each input scaled according to how strongly it relates to the computer-model response. The scaling is learned from the data, by estimating parameters in the GP covariance function using Fisher scoring. Our methods are highly scalable, enabling estimation, joint prediction and simulation in near-linear time in the number of model runs. In several numerical examples, our approach substantially outperformed existing methods.
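The "scaled" ingredient is simple to state: divide each input dimension by its estimated range parameter before ordering and neighbor search, so the conditioning sets reflect how strongly each input drives the response. The sketch below assumes the ranges are already estimated (the paper obtains them by Fisher scoring) and uses a coordinate sort in place of the paper's maximin ordering.

```python
# Sketch of relevance-scaled neighbor selection for Vecchia conditioning.
# Ranges are assumed given (the paper estimates them by Fisher scoring);
# a coordinate sort stands in for the paper's maximin ordering.
import numpy as np
from scipy.spatial import cKDTree

def scaled_neighbors(inputs, ranges, m=10):
    scaled = inputs / ranges                 # relevance-scaled input space
    order = np.argsort(scaled[:, 0])         # crude ordering stand-in
    placed, nbrs = [], []
    for idx in order:
        if placed:
            # tree rebuilt per step for clarity; real code is incremental
            tree = cKDTree(scaled[placed])
            _, loc = tree.query(scaled[idx], k=min(m, len(placed)))
            nbrs.append([placed[j] for j in np.atleast_1d(loc)])
        else:
            nbrs.append([])
        placed.append(idx)
    return order, nbrs
```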
Guinness reinstates Billy 'King of Kong' Mitchell's world records
Billy "Video Game Player of the Century" Mitchell has been vindicated. Today, Guinness World Records reinstated the Donkey Kong and Pac-Man records that were stripped from Mitchell in 2018. Once again, Mitchell holds the first perfect score on Pac-Man and several records for the highest score on Donkey Kong. He has also redeemed recognition as the first player to reach the kill screen on Donkey Kong and the first gamer to score one million points on Donkey Kong. Mitchell, also known as the "King of Kong," had his records expunged by Guinness and Twin Galaxies after an investigation alleged some of his performances on Donkey Kong were not reached on arcade hardware.
Guinness strips Billy 'King of Kong' Mitchell's world records
When Twin Galaxies announced it'd stripped Billy "King of Kong" Mitchell's high scores from its forums yesterday, the gaming record-keeping outfit said it had notified Guinness World Records as well. Today, Kotaku reports that Guinness will strip all of Mitchell's forged video game high scores, including entries for Donkey Kong, Pac-Man and Donkey Kong Jr., from its ledger too. Guinness used Twin Galaxies as its source of verification, according to Kotaku. The outfit said it will begin looking for the deserving record-holder for the now-vacant Pac-Man high score and perfect score in the next few days, because, like Twin Galaxies, Guinness no longer trusts anything that Mitchell has submitted in the past. Thankfully, it looks like we can finally put this whole mess behind us.
Where Technology Meets Storytelling | Clayman & Associates Marketing Solutions
The Drum published a fascinating story recently. The publication was working on an AI issue, and it wanted to know what would happen if, using IBM Watson, marketers could actually interview David Ogilvy. For you non-marketers out there, Ogilvy is widely considered the king of marketing. He passed away in 1999 after an illustrious career in advertising. He was the kind of guy AMC's Mad Men was paying tribute to.