

Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization

Neural Information Processing Systems

Matrix square roots and their inverses arise frequently in machine learning, e.g., when sampling from high-dimensional Gaussians N(0, K) or "whitening" a vector b against a covariance matrix K. While existing methods typically require O(N^3) computation, we introduce a highly efficient quadratic-time algorithm for computing K^{1/2}b, K^{-1/2}b, and their derivatives through matrix-vector multiplications (MVMs). Our method combines Krylov subspace methods with a rational approximation and typically achieves 4 decimal places of accuracy with fewer than 100 MVMs. Moreover, the backward pass requires little additional computation. We demonstrate our method's applicability on matrices as large as 50,000 by 50,000, well beyond the reach of traditional methods, with little approximation error. Applying this increased scalability to variational Gaussian processes, Bayesian optimization, and Gibbs sampling results in more powerful models with higher accuracy. In particular, we perform variational GP inference with up to 10,000 inducing points and perform Gibbs sampling on a 25,000-dimensional problem.
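The abstract's core recipe, a rational approximation applied through shifted MVM-based Krylov solves, can be illustrated with a short sketch. The code below is not the authors' implementation (the paper uses a multi-shift MINRES solver that handles all shifts jointly); it is a minimal, hedged approximation based on the standard identity K^{-1/2} = (2/pi) * int_0^inf (K + t^2 I)^{-1} dt, discretized with Gauss-Legendre quadrature after the substitution t = tan(theta), with one independent SciPy MINRES solve per quadrature node. The helper name sqrt_mvm and the quadrature size are illustrative choices.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, minres

def sqrt_mvm(K, b, num_quad=20):
    """Illustrative sketch: approximate K^{1/2} b using only MVMs with K.

    Uses the identity K^{-1/2} = (2/pi) * int_0^inf (K + t^2 I)^{-1} dt,
    discretized by Gauss-Legendre quadrature after t = tan(theta), with one
    MINRES solve per quadrature node (the paper's method instead solves all
    shifted systems simultaneously with a multi-shift Krylov solver).
    """
    n = b.shape[0]
    nodes, weights = np.polynomial.legendre.leggauss(num_quad)
    theta = 0.25 * np.pi * (nodes + 1.0)      # map [-1, 1] -> [0, pi/2]
    weights = 0.25 * np.pi * weights          # Jacobian of the affine map
    shifts = np.tan(theta) ** 2               # t^2 at each node
    jac = 1.0 / np.cos(theta) ** 2            # dt = sec^2(theta) dtheta

    K_op = LinearOperator((n, n), matvec=lambda v: K @ v)
    inv_sqrt_b = np.zeros(n)
    for w, s, j in zip(weights, shifts, jac):
        # minres solves (A - shift*I) x = b, so shift=-s gives (K + s I) x = b.
        x, info = minres(K_op, b, shift=-s)
        inv_sqrt_b += w * j * x
    inv_sqrt_b *= 2.0 / np.pi                 # approx K^{-1/2} b
    return K @ inv_sqrt_b                     # K^{1/2} b = K (K^{-1/2} b)
```

On a small, well-conditioned SPD matrix the sketch can be sanity-checked against a dense reference such as scipy.linalg.sqrtm(K) @ b; accuracy improves as num_quad grows, at the cost of one additional shifted solve per node.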


Review for NeurIPS paper: Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization

Neural Information Processing Systems

Summary and Contributions:

3rd EDIT: With the notebook now attached to this submission, I am content with the quality of the empirical evaluation. For double precision, the Cholesky decomposition fails less often, whereas MINRES needs more iterations to converge. Since this aspect is neither explored nor even mentioned, I recommend rejecting this paper.

EDIT: In their rebuttal, the authors merely brushed over my concern that the number of iterations is limited to J = 200. It appears this parameter is far more crucial than the authors are willing to admit, and reproducing the experiments turned out to be difficult.
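The reviewer's iteration-cap concern is straightforward to probe empirically. The snippet below is a hypothetical experiment (not taken from the submission or its notebook) that measures how the relative residual of SciPy's MINRES depends on a cap such as J = 200 for an RBF kernel matrix with a small jitter; the matrix size, kernel, and jitter are arbitrary illustrative choices.

```python
import numpy as np
from scipy.sparse.linalg import minres
from scipy.spatial.distance import cdist

# Hypothetical probe of the iteration-cap concern: how does the MINRES
# residual behave as the cap J grows on a moderately ill-conditioned kernel?
rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 5))
K = np.exp(-0.5 * cdist(X, X, "sqeuclidean")) + 1e-4 * np.eye(n)  # RBF + jitter
b = rng.standard_normal(n)

for J in (50, 100, 200, 400):
    # A nonzero info flag means the tolerance was not reached within J iterations.
    x, info = minres(K, b, maxiter=J)
    rel_res = np.linalg.norm(K @ x - b) / np.linalg.norm(b)
    print(f"J={J:4d}  info={info}  relative residual={rel_res:.2e}")
```

If the residual is still dropping sharply between J = 200 and J = 400, the cap is binding and results at J = 200 would understate the solver's attainable accuracy, which is the behavior the review flags.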

