Review for NeurIPS paper: Kernel Methods Through the Roof: Handling Billions of Points Efficiently
–Neural Information Processing Systems
Weaknesses: In my opinion, comparing Nystrom for kernel ridge regression to variational GPs is apples to oranges in a lot of ways that are frankly unfair to variational GPs. In my view, a much more appropriate comparison would be a KeOps based implementation of SGPR or FITC with fixed inducing points. Variational GPs introduce a very large number of parameters in the form of the variational distribution and inducing point locations that require optimization and significantly increase the total amount of time spent in optimization. Methods that train GPs through the marginal likelihood with fixed inducing locations (e.g., as in Nystrom) may have as few as 3 parameters to fit. By contrast, SVGP learns (1) a variational distribution q(u) including a variational covariance matrix, and (2) the inducing point locations.
Neural Information Processing Systems
Jan-27-2025, 07:34:48 GMT
- Technology: