AITopics

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Modeling & Simulation (0.74)

Neural Information Processing Systems, Jan-27-2025, 07:34:48 GMT

Review for NeurIPS paper: Kernel Methods Through the Roof: Handling Billions of Points Efficiently

Weaknesses: In my opinion, comparing Nyström for kernel ridge regression to variational GPs is apples to oranges in ways that are frankly unfair to variational GPs. A much more appropriate comparison would be a KeOps-based implementation of SGPR or FITC with fixed inducing points. Variational GPs introduce a very large number of parameters, in the form of the variational distribution and the inducing point locations, that require optimization and significantly increase the total optimization time. Methods that train GPs through the marginal likelihood with fixed inducing locations (e.g., as in Nyström) may have as few as 3 parameters to fit. By contrast, SVGP learns (1) a variational distribution q(u), including a variational covariance matrix, and (2) the inducing point locations.
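To make the reviewer's point about parameter counts concrete, here is a minimal sketch of Nyström kernel ridge regression with fixed centers. It is not the reviewed paper's implementation: the function names, the RBF kernel, and the `lengthscale`, `reg`, and `jitter` parameters are illustrative assumptions. With the centers held fixed, the only quantities left to tune are a few kernel and regularization hyperparameters, and the weights come from a single linear solve.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """RBF kernel between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def nystrom_krr_fit(x, y, centers, lengthscale=1.0, reg=1e-3, jitter=1e-8):
    """Solve (K_mn K_nm + reg * n * K_mm) alpha = K_mn y for fixed centers.

    Hypothetical helper: `centers` are never optimized, so fitting is one
    m x m linear solve rather than an iterative variational optimization.
    """
    n, m = x.shape[0], centers.shape[0]
    k_nm = rbf_kernel(x, centers, lengthscale)
    k_mm = rbf_kernel(centers, centers, lengthscale)
    lhs = k_nm.T @ k_nm + reg * n * k_mm + jitter * np.eye(m)
    rhs = k_nm.T @ y
    return np.linalg.solve(lhs, rhs)


def nystrom_krr_predict(x_test, centers, alpha, lengthscale=1.0):
    """Predict with f(x) = sum_j alpha_j k(x, center_j)."""
    return rbf_kernel(x_test, centers, lengthscale) @ alpha
```

A variational method such as SVGP would additionally treat the centers and an m x m covariance as free parameters, which is the asymmetry the reviewer objects to.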

neurips paper, point efficiently, point location

Technology: Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.40)

Neural Information Processing Systems, Mar-12-2024, 12:45:12 GMT

Understanding Probabilistic Sparse Gaussian Process Approximations

Bauer, Matthias (Department of Engineering, University of Cambridge, Cambridge, UK)

Good sparse approximations are essential for practical inference in Gaussian Processes as the computational cost of exact methods is prohibitive for large datasets. The Fully Independent Training Conditional (FITC) and the Variational Free Energy (VFE) approximations are two recent popular methods. Despite superficial similarities, these approximations have surprisingly different theoretical properties and behave differently in practice. We thoroughly investigate the two methods for regression both analytically and through illustrative examples, and draw conclusions to guide practical application.

fitc, marginal likelihood, vfe

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (1.00)
North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Modeling & Simulation (0.74)

Reeb, David, Doerr, Andreas, Gerwinn, Sebastian, Rakitsch, Barbara

Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds

arXiv.org Machine Learning, Oct-29-2018

Gaussian Processes (GPs) are a generic modelling tool for supervised learning. While they have been successfully applied on large datasets, their use in safety-critical applications is hindered by the lack of good performance guarantees. To this end, we propose a method to learn GPs and their sparse approximations by directly optimizing a PAC-Bayesian bound on their generalization performance, instead of maximizing the marginal likelihood. Besides its theoretical appeal, we find in our evaluation that our learning method is robust and yields significantly better generalization guarantees than other common GP approaches on several regression benchmark datasets.

artificial intelligence, bayesian inference, machine learning

1810.12263

Country: North America (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bauer, Matthias, van der Wilk, Mark, Rasmussen, Carl Edward

Understanding Probabilistic Sparse Gaussian Process Approximations

arXiv.org Machine Learning, May-30-2017

Good sparse approximations are essential for practical inference in Gaussian Processes as the computational cost of exact methods is prohibitive for large datasets. The Fully Independent Training Conditional (FITC) and the Variational Free Energy (VFE) approximations are two recent popular methods. Despite superficial similarities, these approximations have surprisingly different theoretical properties and behave differently in practice. We thoroughly investigate the two methods for regression both analytically and through illustrative examples, and draw conclusions to guide practical application.

artificial intelligence, machine learning, modeling & simulation

1606.0482

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Modeling & Simulation (0.74)

Bauer, Matthias, van der Wilk, Mark, Rasmussen, Carl Edward

Understanding Probabilistic Sparse Gaussian Process Approximations

Neural Information Processing Systems, Dec-31-2016

Good sparse approximations are essential for practical inference in Gaussian Processes as the computational cost of exact methods is prohibitive for large datasets. The Fully Independent Training Conditional (FITC) and the Variational Free Energy (VFE) approximations are two recent popular methods. Despite superficial similarities, these approximations have surprisingly different theoretical properties and behave differently in practice. We thoroughly investigate the two methods for regression both analytically and through illustrative examples, and draw conclusions to guide practical application.

artificial intelligence, machine learning, modeling & simulation

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Modeling & Simulation (0.74)

AAAI Conferences, Jul-15-2015

EigenGP: Gaussian Process Models with Adaptive Eigenfunctions

Peng, Hao (Purdue University) | Qi, Yuan (Purdue University)

Gaussian processes (GPs) provide a nonparametric representation of functions. However, classical GP inference suffers from high computational cost for big data. In this paper, we propose a new Bayesian approach, EigenGP, that learns both basis dictionary elements — eigenfunctions of a GP prior — and prior precisions in a sparse finite model. It is well known that, among all orthogonal basis functions, eigenfunctions can provide the most compact representation. Unlike other sparse Bayesian finite models where the basis function has a fixed form, our eigenfunctions live in a reproducing kernel Hilbert space as a finite linear combination of kernel functions. We learn the dictionary elements — eigenfunctions — and the prior precisions over these elements as well as all the other hyperparameters from data by maximizing the model marginal likelihood. We explore computational linear algebra to simplify the gradient computation significantly. Our experimental results demonstrate improved predictive performance of EigenGP over alternative sparse GP methods as well as relevance vector machines.

basis function, eigenfunction, eigengp

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > California (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Wilson, Andrew Gordon, Nickisch, Hannes

Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP)

arXiv.org Machine Learning, Mar-3-2015

We introduce a new structured kernel interpolation (SKI) framework, which generalises and unifies inducing point methods for scalable Gaussian processes (GPs). SKI methods produce kernel approximations for fast computations through kernel interpolation. The SKI framework clarifies how the quality of an inducing point approach depends on the number of inducing (aka interpolation) points, interpolation strategy, and GP covariance kernel. SKI also provides a mechanism to create new scalable kernel methods, through choosing different kernel interpolation strategies. Using SKI, with local cubic kernel interpolation, we introduce KISS-GP, which is 1) more scalable than inducing point alternatives, 2) naturally enables Kronecker and Toeplitz algebra for substantial additional gains in scalability, without requiring any grid data, and 3) can be used for fast and expressive kernel learning. KISS-GP costs O(n) time and storage for GP inference. We evaluate KISS-GP for kernel matrix approximation, kernel learning, and natural sound modelling.
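The kernel-interpolation idea in this abstract can be sketched as K ≈ W K_UU Wᵀ, where K_UU is the kernel on a grid of inducing points and W holds interpolation weights. The sketch below is only illustrative and makes several assumptions not in the abstract: it uses linear interpolation on a uniform 1-D grid instead of local cubic interpolation, an RBF kernel, and a dense W for readability, so it does not achieve the O(n) cost quoted above. All function names are hypothetical.

```python
import numpy as np


def rbf_kernel_1d(a, b, lengthscale=1.0):
    """RBF kernel between two 1-D arrays of inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)


def linear_interp_weights(x, grid):
    """Interpolation matrix W (n, m) with at most two nonzeros per row.

    Assumes `grid` is uniformly spaced and every x lies inside its range.
    A scalable version would store W as a sparse matrix.
    """
    n, m = x.size, grid.size
    h = grid[1] - grid[0]
    left = np.clip(np.floor((x - grid[0]) / h).astype(int), 0, m - 2)
    frac = (x - grid[left]) / h
    w = np.zeros((n, m))
    w[np.arange(n), left] = 1.0 - frac
    w[np.arange(n), left + 1] = frac
    return w


def ski_kernel(x, grid, lengthscale=1.0):
    """Approximate K(x, x) by W K_UU W^T using the grid as inducing points."""
    w = linear_interp_weights(x, grid)
    k_uu = rbf_kernel_1d(grid, grid, lengthscale)
    return w @ k_uu @ w.T
```

Because the grid is regular, K_UU has Toeplitz structure in one dimension, which is what a full implementation would exploit for fast matrix-vector products.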

artificial intelligence, interpolation, machine learning

1503.01057

Country: North America (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Modeling & Simulation (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Frigola, Roger, Rasmussen, Carl Edward

Integrated Pre-Processing for Bayesian Nonlinear System Identification with Gaussian Processes

arXiv.org Machine Learning, Sep-17-2013

We introduce GP-FNARX: a new model for nonlinear system identification based on a nonlinear autoregressive exogenous model (NARX) with filtered regressors (F) where the nonlinear regression problem is tackled using sparse Gaussian processes (GP). We integrate data pre-processing with system identification into a fully automated procedure that goes from raw data to an identified model. Both pre-processing parameters and GP hyper-parameters are tuned by maximizing the marginal likelihood of the probabilistic model. We obtain a Bayesian model of the system's dynamics which is able to report its uncertainty in regions where the data is scarce. The automated approach, the modeling of uncertainty and its relatively low computational cost make GP-FNARX a good candidate for applications in robotics and adaptive control.

artificial intelligence, bayesian inference, machine learning

1303.2912

Country: Europe (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Chalupka, Krzysztof, Williams, Christopher K. I., Murray, Iain

A Framework for Evaluating Approximation Methods for Gaussian Process Regression

arXiv.org Machine Learning, Nov-5-2012

Gaussian process (GP) predictors are an important component of many Bayesian approaches to machine learning. However, even a straightforward implementation of Gaussian process regression (GPR) requires O(n^2) space and O(n^3) time for a dataset of n examples. Several approximation methods have been proposed, but there is a lack of understanding of the relative merits of the different approximations, and in what situations they are most useful. We recommend assessing the quality of the predictions obtained as a function of the compute time taken, and comparing to standard baselines (e.g., Subset of Data and FITC). We empirically investigate four different approximation algorithms on four different prediction problems, and make our code available to encourage future comparisons.
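As a rough illustration of where the O(n^2) space and O(n^3) time come from, and of the Subset of Data baseline named above, consider the following sketch. It is not the authors' released code: the RBF kernel, the fixed hyperparameters, and the function names are assumptions made for the example.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_predict_mean(x_train, y_train, x_test, noise=0.1):
    """Exact GP posterior mean with fixed hyperparameters.

    Storing the n x n matrix costs O(n^2) memory; its Cholesky
    factorization costs O(n^3) time.
    """
    n = x_train.shape[0]
    k = rbf_kernel(x_train, x_train) + noise**2 * np.eye(n)
    chol = np.linalg.cholesky(k)
    # Two solves against the Cholesky factor give alpha = K^{-1} y.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return rbf_kernel(x_test, x_train) @ alpha


def subset_of_data_predict_mean(x_train, y_train, x_test, m, noise=0.1, seed=0):
    """Subset of Data baseline: run exact GP regression on m random points."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(x_train.shape[0], size=m, replace=False)
    return gp_predict_mean(x_train[idx], y_train[idx], x_test, noise)
```

Timing both functions for increasing `m` and plotting error against wall-clock time is one simple way to follow the abstract's recommendation of judging prediction quality as a function of compute.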

artificial intelligence, bayesian inference, machine learning

1205.6326

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)