
Exact Gaussian Processes on a Million Data Points

Wang, Ke Alexander, Pleiss, Geoff, Gardner, Jacob R., Tyree, Stephen, Weinberger, Kilian Q., Wilson, Andrew Gordon

Neural Information Processing Systems

Gaussian processes (GPs) are flexible non-parametric models, with a capacity that grows with the available data. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. By partitioning and distributing kernel matrix multiplies, we demonstrate that an exact GP can be trained on over a million points, a task previously thought to be impossible with current computing hardware. Moreover, our approach is generally applicable, without constraints to grid data or specific kernel classes. Enabled by this scalability, we perform the first-ever comparison of exact GPs against scalable GP approximations on datasets with $10^4 \!-\! 10^6$ data points, showing dramatic performance improvements.
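The abstract's central computational idea, accessing the kernel matrix only through matrix-vector multiplication via linear conjugate gradients, can be sketched as follows. This is a minimal NumPy illustration under assumed choices (an RBF kernel, toy 1-D data, a fixed noise level and tolerance), not the paper's actual multi-GPU implementation:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """RBF kernel matrix between two sets of points (illustrative choice)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def conjugate_gradients(matvec, b, tol=1e-8, max_iters=2000):
    """Solve A x = b for symmetric positive-definite A,
    touching A only through the matvec callable."""
    x = np.zeros_like(b)
    r = b - matvec(x)          # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy exact-GP regression: solve (K + sigma^2 I)^{-1} y without factorizing K.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
noise = 0.1**2

K = rbf_kernel(X, X)
solve = conjugate_gradients(lambda v: K @ v + noise * v, y)

X_test = np.linspace(-3, 3, 50)[:, None]
mean = rbf_kernel(X_test, X) @ solve   # predictive mean at test points
```

Because the solver sees the kernel only through `matvec`, the same code works whether `K @ v` is computed densely, in partitioned blocks, or across several GPUs.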



We will report the NLLs in the final version of the paper, in addition to reporting averages and standard deviations in all of our other 3 tables, by running more trials

Neural Information Processing Systems

We agree with all three reviewers that evaluating the predictive variances is important; thank you for your comments and suggestions. We will clarify that SGPR is by Titsias (2009) and SVGP is by Hensman et al. (2013). We were unaware of Nguyen's paper at submission, and we will add this discussion to the paper. We note that the precomputation, like CG, can be run to a specified desired tolerance. Hensman et al. (2013) used 1,000 inducing points on the massive Airline dataset.




Exact Gaussian Processes on a Million Data Points

Wang, Ke Alexander, Pleiss, Geoff, Gardner, Jacob R., Tyree, Stephen, Weinberger, Kilian Q., Wilson, Andrew Gordon

arXiv.org Machine Learning

Gaussian processes (GPs) are flexible models with state-of-the-art performance on many impactful applications. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. By partitioning and distributing kernel matrix multiplies, we demonstrate that an exact GP can be trained on over a million points in 3 days using 8 GPUs and can compute predictive means and variances in under a second using 1 GPU at test time. Moreover, we perform the first-ever comparison of exact GPs against state-of-the-art scalable approximations on large-scale regression datasets with $10^4-10^6$ data points, showing dramatic performance improvements.
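The partition-and-distribute strategy for kernel matrix multiplies can likewise be sketched: split the rows of the kernel matrix into blocks, form each block on the fly, multiply it by the vector, and discard it, so the full n × n matrix is never materialized. This is a single-process NumPy sketch with an assumed RBF kernel; in the paper's setting each row block would be assigned to a different GPU rather than processed sequentially:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """RBF kernel matrix between two sets of points (illustrative choice)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def partitioned_kernel_matvec(X, v, num_partitions=4, lengthscale=1.0):
    """Compute K(X, X) @ v one row block at a time.

    Peak memory is O((n / num_partitions) * n) for the current block,
    instead of O(n^2) for the full kernel matrix.
    """
    n = X.shape[0]
    out = np.empty(n)
    for idx in np.array_split(np.arange(n), num_partitions):
        # Form only this block's rows of K, use them, then let them be freed.
        out[idx] = rbf_kernel(X[idx], X, lengthscale) @ v
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))
v = rng.standard_normal(1000)
result = partitioned_kernel_matvec(X, v)
```

Since the blocks are independent, they parallelize trivially across devices, which is what makes conjugate-gradient training feasible when n² kernel entries no longer fit in any single GPU's memory.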