Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

We thank the reviewers for their positive feedback on our paper. Rev 2 and 5 asked about the difference between LL-LVM and GP-LVM: as noted in lines 65-68 of the introduction, both are Gaussian models, but GP-LVM is parameterised by a stationary covariance function rather than by a precision matrix defined over local neighbourhoods, which seeks to explicitly preserve local neighbourhood properties. Rev 1: We will add the complexity analysis. An expensive step in the E-step is the inversion of the posterior precision matrices of both x and C. We will study possible ways to decrease the complexity in future work.
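As a rough sketch of that cost (our own notation, not given in the rebuttal, and assuming the joint posterior precision over all N latent coordinates is dense): with latent dimension d_x, inverting the precision of q(x) requires on the order of

    \mathcal{O}\big((N d_x)^3\big)

operations, and the precision of q(C) scales similarly, so the E-step is cubic in the number of data points.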


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Summary: This paper proposes a maximum penalized likelihood version of the GP-LVM (equation after eq. 4). The penalty term added to the GP-LVM log likelihood is the log joint probability density function of the inputs under a Coulomb repulsive process.

Clarity: good.

Originality: good, to the best of my knowledge.

Significance: medium (early days; still far from an easy-to-reimplement approach).

Details: The optimization depends critically on initialization, and the authors propose a heuristic that relies on a similarity-preserving traditional embedding, as well as a way to initialize the GP hyperparameters. Obtaining posterior uncertainty is tedious and there is no good solution; the authors propose a heuristic.
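In symbols (our notation; the review does not reproduce the equation), the penalized objective described above has the form

    \hat{X} = \arg\max_{X,\,\theta}\; \log p(Y \mid X, \theta) + \log \pi_{\mathrm{Corp}}(X),

where log p(Y | X, θ) is the GP-LVM log likelihood of the observations Y with hyperparameters θ, and π_Corp is the Coulomb repulsive process density over the latent inputs X.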


Hyperboloid GPLVM for Discovering Continuous Hierarchies via Nonparametric Estimation

Watanabe, Koshi, Maeda, Keisuke, Ogawa, Takahiro, Haseyama, Miki

arXiv.org Artificial Intelligence

Dimensionality reduction (DR) offers a useful representation of complex high-dimensional data. Recent DR methods focus on hyperbolic geometry to derive faithful low-dimensional representations of hierarchical data. However, existing methods are based on neighbor embedding, which frequently breaks the continuous relations within the hierarchies. This paper presents hyperboloid Gaussian process (GP) latent variable models (hGP-LVMs) to embed high-dimensional hierarchical data with implicit continuity via nonparametric estimation. We adopt generative modeling using the GP, which yields effective hierarchical embeddings and makes the otherwise ill-posed hyperparameter tuning tractable. This paper presents three variants that employ point, sparse-point, and Bayesian estimation. We establish their learning algorithms by incorporating Riemannian optimization and the active approximation scheme of the GP-LVM. For Bayesian inference, we further introduce the reparameterization trick to realize Bayesian latent variable learning. Finally, we apply hGP-LVMs to several datasets and show their ability to represent high-dimensional hierarchies in low-dimensional spaces.
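The paper's kernels and update rules are not reproduced here, but a minimal sketch of the Lorentz-model primitives that Riemannian optimization over hyperboloid latent variables typically relies on (all names our own) looks like this:

    import numpy as np

    def lorentz_inner(u, v):
        # Minkowski inner product <u, v>_L = -u0*v0 + u1*v1 + ... + ud*vd
        return -u[0] * v[0] + np.dot(u[1:], v[1:])

    def hyperboloid_distance(x, y):
        # Geodesic distance on the hyperboloid model: arccosh(-<x, y>_L);
        # the clip guards against rounding just below the arccosh domain
        return np.arccosh(np.clip(-lorentz_inner(x, y), 1.0, None))

    def exp_map(x, v):
        # Exponential map at x: move along the geodesic with tangent velocity v,
        # the basic step of Riemannian gradient-based optimization on the manifold
        n = np.sqrt(max(lorentz_inner(v, v), 1e-12))
        return np.cosh(n) * x + np.sinh(n) * (v / n)

For example, x = np.array([1.0, 0.0, 0.0]) is the origin of the two-dimensional hyperboloid, and exp_map(x, v) stays on the manifold for any tangent vector v at x.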


Bayesian Manifold Learning: The Locally Linear Latent Variable Model

Neural Information Processing Systems

We introduce the Locally Linear Latent Variable Model (LL-LVM), a probabilistic model for non-linear manifold discovery that describes a joint distribution over observations, their manifold coordinates, and locally linear maps conditioned on a set of neighbourhood relationships. The model allows straightforward variational optimisation of the posterior distribution over coordinates and locally linear maps from the latent space to the observation space, given the data. Thus, the LL-LVM encapsulates the local-geometry-preserving intuitions that underlie non-probabilistic methods such as locally linear embedding (LLE). Its probabilistic semantics make it easy to evaluate the quality of hypothesised neighbourhood relationships, to select the intrinsic dimensionality of the manifold, to construct out-of-sample extensions, and to combine the manifold model with additional probabilistic models that capture the structure of coordinates within the manifold.
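As a sketch of the local-geometry term (our own simplified notation; the model places Gaussian distributions around these relations rather than hard constraints), the locally linear maps relate latent and observed differences for each neighbour pair:

    import numpy as np

    def local_linear_residuals(Y, X, C, neighbors):
        # For each neighbour pair (i, j), LL-LVM favours
        #     y_j - y_i ≈ C_i @ (x_j - x_i),
        # where C[i] is the d_y x d_x locally linear map at point i,
        # Y is (n, d_y), X is (n, d_x), neighbors[i] lists the neighbours of i.
        res = []
        for i, nbrs in enumerate(neighbors):
            for j in nbrs:
                res.append((Y[j] - Y[i]) - C[i] @ (X[j] - X[i]))
        return np.asarray(res)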


Probabilistic Curve Learning: Coulomb Repulsion and the Electrostatic Gaussian Process

Neural Information Processing Systems

Learning low-dimensional structure in multidimensional data is a canonical problem in machine learning. One common approach is to suppose that the observed data lie close to a lower-dimensional smooth manifold. There is a rich variety of manifold learning methods available that map data points to the manifold. However, there is a clear lack of probabilistic methods that learn the manifold along with the generative distribution of the observed data. The best attempt is the Gaussian process latent variable model (GP-LVM), but identifiability issues lead to poor performance. We solve these issues by proposing a novel Coulomb repulsive process (Corp) for the locations of points on the manifold, inspired by physical models of electrostatic interactions among particles. Combining this process with a GP prior on the mapping function yields a novel electrostatic GP (electroGP) process. Focusing on the simple case of a one-dimensional manifold, we develop efficient inference algorithms and illustrate substantially improved performance in a variety of experiments, including filling in missing frames in video.
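A minimal sketch of such a repulsion penalty (the exact Corp density is defined in the paper; the sin(π·) pairwise form and the parameter r here are illustrative assumptions for latent locations in (0, 1)):

    import numpy as np

    def log_repulsion(x, r=1.0, eps=1e-12):
        # Pairwise electrostatic-style penalty: diverges towards -inf as two
        # latent locations coincide, pushing points apart and addressing the
        # non-identifiability that hurts the plain GP-LVM
        x = np.asarray(x, dtype=float)
        i, j = np.triu_indices(len(x), k=1)
        gaps = np.abs(x[i] - x[j])
        return 2.0 * r * np.sum(np.log(np.sin(np.pi * gaps) + eps))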


Clustering based on Mixtures of Sparse Gaussian Processes

Moslehi, Zahra, Mirzaei, Abdolreza, Safayani, Mehran

arXiv.org Artificial Intelligence

Creating low-dimensional representations of a high-dimensional data set is an important component in many machine learning applications. How to cluster data using their low-dimensional embedded space is still a challenging problem in machine learning. In this article, we propose a joint formulation for both clustering and dimensionality reduction. When a probabilistic model is desired, one possible solution is to use mixture models in which both the cluster indicators and the low-dimensional space are learned. Our algorithm is based on a mixture of sparse Gaussian processes, which we call Sparse Gaussian Process Mixture Clustering (SGP-MIC). The main advantages of our approach over existing methods are that its probabilistic nature offers more flexibility than deterministic alternatives, that non-linear generalizations of the model are straightforward to construct, and that a sparse model together with an efficient variational EM approximation speeds up the algorithm.
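A hedged sketch of the E-step that any such mixture relies on, written generically with per-component log-likelihoods supplied from outside (in SGP-MIC they would come from the sparse-GP variational bound):

    import numpy as np

    def e_step_responsibilities(loglik, log_weights):
        # loglik[n, k]: log-likelihood of point n under mixture component k
        # log_weights[k]: log mixing proportion of component k
        log_r = loglik + log_weights               # unnormalised log responsibilities
        log_r -= log_r.max(axis=1, keepdims=True)  # stabilise the exponentials
        r = np.exp(log_r)
        return r / r.sum(axis=1, keepdims=True)    # normalise over components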


Revisiting Active Sets for Gaussian Process Decoders

Moreno-Muñoz, Pablo, Feldager, Cilie W, Hauberg, Søren

arXiv.org Artificial Intelligence

Decoders built on Gaussian processes (GPs) are enticing due to the marginalisation over the non-linear function space. Such models (also known as GP-LVMs) are often expensive and notoriously difficult to train in practice, but can be scaled using variational inference and inducing points. In this paper, we revisit active set approximations. We develop a new stochastic estimate of the log-marginal likelihood based on recently discovered links to cross-validation, and we propose a computationally efficient approximation thereof. We demonstrate that the resulting stochastic active sets (SAS) approximation significantly improves the robustness of GP decoder training, while reducing computational cost. The SAS-GP obtains more structure in the latent space, scales to many datapoints, and learns better representations than variational autoencoders, which is rarely the case for GP decoders.
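The paper's estimator is derived from links to cross-validation, which are not reproduced here; a hedged sketch of the basic active-set idea (score a randomly drawn subset with the exact GP marginal likelihood and use it as a stochastic training signal; names and the plain log-marginal stand-in are our assumptions) might look like:

    import numpy as np

    def active_set_log_marginal(K, y, active, noise_var=1e-2):
        # Exact GP log marginal likelihood restricted to a random active subset
        # (a stand-in for the SAS estimator described in the paper)
        Ka = K[np.ix_(active, active)] + noise_var * np.eye(len(active))
        ya = y[active]
        L = np.linalg.cholesky(Ka)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, ya))
        return (-0.5 * ya @ alpha
                - np.log(np.diag(L)).sum()
                - 0.5 * len(active) * np.log(2 * np.pi))

    # One stochastic step, e.g.: active = rng.choice(N, size=m, replace=False)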


Uncertainty Quantification of Darcy Flow through Porous Media using Deep Gaussian Process

Daneshkhah, A., Chatrabgoun, O., Esmaeilbeigi, M., Sedighi, T., Abolfathi, S.

arXiv.org Machine Learning

A computational method based on non-linear Gaussian processes (GPs), known as deep Gaussian processes (deep GPs), is presented for uncertainty quantification and propagation in modelling of flow through heterogeneous porous media. The method is also used to reduce the dimensionality of the model output and consequently to emulate the highly complex relationship between hydrogeological properties and the reduced-order fluid velocity field in a tractable manner. Deep GPs are multi-layer hierarchical generalisations of GPs with multiple, infinitely wide hidden layers; they are very efficient models for deep learning and for modelling high-dimensional complex systems, tackling the complexity through several hidden layers connected by non-linear mappings. In this approach, the hydrogeological data are modelled as the output of a multivariate GP whose inputs are governed by another GP, such that each layer is either a standard GP or a Gaussian process latent variable model. A variational approximation framework is used so that the posterior distribution of the model outputs for given inputs can be analytically approximated. In contrast to other dimensionality reduction methods, which provide no information about the dimensionality of each hidden layer, the proposed method automatically selects the dimensionality of each hidden layer, and it can be used to propagate the uncertainty obtained in each layer across the hierarchy. As a result, the dimensionality of the full input space, which consists of both geometrical parameters of the modelling domain and stochastic hydrogeological parameters, can be reduced without the simplifications generally assumed in stochastic modelling of subsurface flow problems. This allows estimation of the flow statistics with greatly reduced computational effort compared to other stochastic approaches such as the Monte Carlo method.
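A minimal sketch of the layered construction described above (our own toy version: a prior sample from a two-layer deep GP, where the hidden layer's values become the inputs of the next GP; in the paper each layer may instead be a GP-LVM with its own automatically selected dimensionality):

    import numpy as np

    def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
        # Squared-exponential kernel between row vectors of A (n, d) and B (m, d)
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return variance * np.exp(-0.5 * d2 / lengthscale**2)

    def sample_two_layer_gp(X, hidden_dim=2, jitter=1e-8, rng=None):
        # Hidden layer H: a GP draw on the inputs X; output: a GP draw on H
        rng = rng or np.random.default_rng(0)
        n = len(X)
        K1 = rbf_kernel(X, X) + jitter * np.eye(n)
        H = np.linalg.cholesky(K1) @ rng.standard_normal((n, hidden_dim))
        K2 = rbf_kernel(H, H) + jitter * np.eye(n)
        return np.linalg.cholesky(K2) @ rng.standard_normal(n)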


Mixed Likelihood Gaussian Process Latent Variable Model

Murray, Samuel, Kjellström, Hedvig

arXiv.org Machine Learning

We present the Mixed Likelihood Gaussian process latent variable model (GP-LVM), capable of modeling data with attributes of different types. The standard formulation of the GP-LVM assumes that each observation is drawn from a Gaussian distribution, which makes the model unsuited for data with, e.g., categorical or nominal attributes. Our model, for which we use sampling-based variational inference, instead assumes a separate likelihood for each observed dimension. This formulation results in more meaningful latent representations and gives better predictive performance on real-world data with dimensions of different types.
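A hedged sketch of the per-dimension likelihood idea (our own simplified types and link functions; the paper's sampling-based variational inference is not shown):

    import numpy as np

    def mixed_log_likelihood(Y, F, types):
        # Y[:, d]: observed column d; F[:, d]: latent GP function values for d.
        # Each dimension gets the likelihood matching its declared type.
        total = 0.0
        for d, t in enumerate(types):
            if t == "gaussian":                       # continuous attribute
                total += -0.5 * np.sum((Y[:, d] - F[:, d]) ** 2)  # up to constants
            elif t == "bernoulli":                    # binary attribute, logistic link
                p = 1.0 / (1.0 + np.exp(-F[:, d]))
                total += np.sum(Y[:, d] * np.log(p) + (1 - Y[:, d]) * np.log1p(-p))
            else:
                raise ValueError(f"unknown type: {t}")
        return total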