Hierarchical Gaussian Processes with Wasserstein-2 Kernels

Sebastian Popescu, David Sharp, James Cole, Ben Glocker

arXiv.org Machine Learning 

Deep Gaussian Processes (DGPs) (Damianou and Lawrence, 2013) are a multi-layered generalization of Gaussian Processes (GPs) that inherit the advantages of GPs, namely calibrated predictive uncertainty and data-efficient learning. This makes them attractive in domains where data is sparse, such as medical imaging, and in safety-critical applications such as self-driving cars. The sequential embedding of the input through stacked layers of GPs removes the need to hand-tune kernels for specific tasks and implicitly embeds non-stationarity in the final output. Although DGPs can be combined with the inducing point framework introduced in Hensman et al. (2013), this does not yield tractable inference as it does for shallow GPs. Recent implementations using stochastic approximate inference techniques have succeeded in applying DGPs to medium and large datasets (Bui et al., 2016; Salimbeni and Deisenroth, 2017; Havasi et al., 2018; Yu et al., 2019). In this work we build on the framework introduced in Salimbeni and Deisenroth (2017). Recent work (Ustyuzhaninov et al., 2019) has questioned the validity of the uncertainties present in the hidden layers of DGPs, showing that approximate inference schemes using variational Gaussian distributions cause all but the last GP to collapse to deterministic transformations in the case of noiseless data. Such pathological behaviour should be avoided, as it undermines the utility of layered GPs. In this paper we further investigate the status of hidden layer uncertainties in DGPs, show failure cases, and propose a solution by reinterpreting already existing models in Wasserstein-2 space.
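As background for the Wasserstein-2 kernels named in the title: between two Gaussian measures N(m1, S1) and N(m2, S2), the squared Wasserstein-2 distance has the closed form ||m1 - m2||^2 + Tr(S1 + S2 - 2(S2^{1/2} S1 S2^{1/2})^{1/2}), which can be substituted for the Euclidean distance inside a squared-exponential kernel to compare distributions rather than points. The following is a minimal sketch of that idea; the function names, the exponential kernel form, and the lengthscale parameter are illustrative assumptions, not the paper's actual implementation.

import numpy as np
from scipy.linalg import sqrtm

def wasserstein2_sq(m1, S1, m2, S2):
    # Closed-form squared Wasserstein-2 distance between the
    # Gaussians N(m1, S1) and N(m2, S2).
    mean_term = np.sum((m1 - m2) ** 2)
    # Covariance term: Tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2}).
    rS2 = sqrtm(S2)
    cross = sqrtm(rS2 @ S1 @ rS2)
    # sqrtm can return negligible imaginary parts; keep the real part.
    cov_term = np.trace(S1 + S2 - 2.0 * np.real(cross))
    return mean_term + cov_term

def w2_gaussian_kernel(m1, S1, m2, S2, lengthscale=1.0):
    # Hypothetical squared-exponential kernel on Gaussian measures,
    # with W2 in place of the usual Euclidean distance (illustration only).
    return np.exp(-wasserstein2_sq(m1, S1, m2, S2) / (2.0 * lengthscale ** 2))

# Example: two 2-D Gaussians with shifted means and scaled covariances.
m1, S1 = np.zeros(2), np.eye(2)
m2, S2 = np.ones(2), 2.0 * np.eye(2)
print(wasserstein2_sq(m1, S1, m2, S2))   # approx. 2.343
print(w2_gaussian_kernel(m1, S1, m2, S2))

Because each hidden layer of a DGP outputs a Gaussian predictive distribution, a kernel of this kind can act on those distributions directly, retaining the hidden layer uncertainty rather than collapsing it to point estimates.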
