Deep Gaussian Mixture Models

arXiv.org Machine Learning

Deep learning is a hierarchical inference method that stacks multiple layers of learning and can thereby describe complex relationships more efficiently. In this work, Deep Gaussian Mixture Models are introduced and discussed. A Deep Gaussian Mixture Model (DGMM) is a network of multiple layers of latent variables where, at each layer, the variables follow a mixture of Gaussian distributions. The deep mixture model therefore consists of a set of nested mixtures of linear models, which together provide a nonlinear model able to describe the data in a very flexible way. In order to avoid overparameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture, resulting in deep mixtures of factor analysers.
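The following is a minimal generative sketch of the layered construction described above, with two layers of mixtures of linear-Gaussian (factor) maps. All dimensions, component counts, and randomly drawn parameters are illustrative assumptions rather than values from the paper.

# A minimal generative sketch of a two-layer deep Gaussian mixture model.
# Each layer picks a mixture component and applies that component's
# linear-Gaussian (factor) map, so the observed variable follows a nested
# mixture of linear models, i.e. a flexible nonlinear density.
import numpy as np

rng = np.random.default_rng(0)

D = 5          # observed dimension (assumption)
R1, R2 = 3, 2  # latent dimensions at layers 1 and 2 (decreasing, factor-style)
K1, K2 = 4, 3  # number of mixture components at each layer (assumption)

def random_layer(k, d_out, d_in, noise=0.1):
    """One layer: k components, each with intercept, loading matrix, noise."""
    return {
        "w":   np.full(k, 1.0 / k),                   # mixing weights
        "eta": rng.normal(0, 1.0, (k, d_out)),        # intercepts
        "lam": rng.normal(0, 0.5, (k, d_out, d_in)),  # factor loadings
        "psi": np.full((k, d_out), noise),            # diagonal noise variances
    }

layers = [random_layer(K1, D, R1), random_layer(K2, R1, R2)]

def sample(n):
    """Draw n observations by propagating top-level Gaussians down the layers."""
    z = rng.normal(size=(n, R2))                      # top-level latent variable
    for layer in reversed(layers):
        comp = rng.choice(len(layer["w"]), size=n, p=layer["w"])
        mean = layer["eta"][comp] + np.einsum("nij,nj->ni", layer["lam"][comp], z)
        z = mean + rng.normal(size=mean.shape) * np.sqrt(layer["psi"][comp])
    return z                                          # shape (n, D)

y = sample(1000)
print(y.shape)  # (1000, 5)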


A Locally Adaptive Normal Distribution

arXiv.org Machine Learning

The multivariate normal density is a monotonic function of the distance to the mean, and its ellipsoidal shape is due to the underlying Euclidean metric. We suggest replacing this metric with a locally adaptive, smoothly changing (Riemannian) metric that favors regions of high local density. The resulting locally adaptive normal distribution (LAND) is a generalization of the normal distribution to the "manifold" setting, where data is assumed to lie near a potentially low-dimensional manifold embedded in $\mathbb{R}^D$. The LAND is parametric, depending only on a mean and a covariance, and is the maximum entropy distribution under the given metric. The underlying metric is, however, non-parametric. We develop a maximum likelihood algorithm for inferring the distribution parameters that relies on a combination of gradient descent and Monte Carlo integration. We further extend the LAND to mixture models, and provide the corresponding EM algorithm. We demonstrate the ability of the LAND to fit non-trivial probability distributions over both synthetic data and EEG measurements of human sleep.
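As a small illustration of a metric that favors dense regions, the sketch below builds a locally adaptive diagonal metric from a Gaussian-kernel-weighted local second moment (plus a regulariser rho) and measures curve lengths under it. This particular construction, and all parameter values, are assumptions chosen for illustration rather than the paper's exact definition.

# A locally adaptive diagonal Riemannian metric of the kind the abstract
# describes: distances shrink where data are dense and grow where they are not.
import numpy as np

def local_metric(point, data, sigma=1.0, rho=1e-3):
    """Diagonal metric tensor M(point): large entries where data are scarce."""
    w = np.exp(-0.5 * np.sum((data - point) ** 2, axis=1) / sigma**2)
    second_moment = w @ (data - point) ** 2      # kernel-weighted, per dimension
    return 1.0 / (second_moment + rho)           # diagonal of M(point)

def curve_length(curve, data, sigma=1.0, rho=1e-3):
    """Riemannian length of a discretised curve (rows are points)."""
    length = 0.0
    for a, b in zip(curve[:-1], curve[1:]):
        m = local_metric(0.5 * (a + b), data, sigma, rho)
        step = b - a
        length += np.sqrt(step @ (m * step))     # sqrt(step^T M step)
    return length

# Toy check: a segment passing through a dense cluster is "shorter" than an
# equally long segment through empty space.
rng = np.random.default_rng(0)
data = rng.normal(0.0, 0.5, size=(500, 2))       # dense cluster at the origin
through = np.linspace([-1, 0], [1, 0], 50)       # passes through the cluster
around = np.linspace([-1, 5], [1, 5], 50)        # passes through empty space
print(curve_length(through, data) < curve_length(around, data))  # True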


Dirichlet Process Parsimonious Mixtures for clustering

arXiv.org Machine Learning

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proved successful, in particular in cluster analysis. Their estimation is generally performed by maximum likelihood and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious Mixtures (DPPM), which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPMs are Bayesian nonparametric parsimonious mixture models that allow the model parameters, the optimal number of mixture components and the optimal parsimonious mixture structure to be inferred simultaneously from the data. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the DPPM models and provide a Bayesian model selection framework using Bayes factors. We apply the models to cluster simulated data and real data sets, and compare them to the standard parsimonious mixture models. The results highlight the effectiveness of the proposed nonparametric parsimonious mixture models as a good nonparametric alternative to the parametric parsimonious models.
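For readers who want something runnable, the sketch below uses scikit-learn's variational Bayesian Gaussian mixture with a Dirichlet-process weight prior as a rough stand-in: like the DPPM it infers the effective number of components from the data, and its covariance_type option plays a loosely analogous role to the parsimonious constraints. It is neither the authors' model family nor their Gibbs sampler; all data and settings below are illustrative assumptions.

# Dirichlet-process mixture stand-in via scikit-learn's variational inference.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.5, size=(150, 2)) for m in (-3.0, 0.0, 3.0)])

bgm = BayesianGaussianMixture(
    n_components=10,                                   # truncation level
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="spherical",                       # one constrained covariance form
    max_iter=500,
    random_state=0,
).fit(X)

# Components with non-negligible weight indicate the inferred cluster count.
print(np.sum(bgm.weights_ > 0.01))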


An Application of Reversible-Jump MCMC to Multivariate Spherical Gaussian Mixtures

Neural Information Processing Systems

Applications of Gaussian mixture models occur frequently in the fields of statistics and artificial neural networks. One of the key issues arising from any mixture model application is how to estimate the optimum number of mixture components. This paper extends the Reversible-Jump Markov Chain Monte Carlo (MCMC) algorithm to the case of multivariate spherical Gaussian mixtures using a hierarchical prior model. Using this method the number of mixture components is no longer fixed but becomes a parameter of the model which we shall estimate. The Reversible-Jump MCMC algorithm is capable of moving between parameter subspaces which correspond to models with different numbers of mixture components. As a result, a sample from the full joint distribution of all unknown model parameters is generated. The technique is then demonstrated on a simulated example and a well-known vowel dataset.

1 Introduction

Applications of Gaussian mixture models regularly appear in the neural networks literature. One of their most common roles in the field of neural networks is in the placement of centres in a radial basis function network.
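To make the between-model moves concrete, here is a deliberately simplified reversible-jump sketch for a univariate Gaussian mixture. Equal mixture weights, a known shared standard deviation, a Gaussian prior on each mean and a truncated Poisson prior on the number of components are all simplifying assumptions, and the birth move draws the new mean from its prior so the proposal density cancels and the Jacobian is 1. It illustrates the general mechanism only, not the paper's multivariate hierarchical prior model.

# Simplified reversible-jump MCMC for an equal-weight univariate Gaussian
# mixture in which the number of components k is itself sampled.
import numpy as np
from math import lgamma
from scipy.special import logsumexp

rng = np.random.default_rng(0)

SIGMA = 1.0        # known component standard deviation (assumption)
MU_PRIOR_SD = 5.0  # N(0, MU_PRIOR_SD^2) prior on each component mean (assumption)
LAMBDA_K = 3.0     # Poisson prior rate on k (assumption)
K_MAX = 10         # truncation of the prior on k

def log_lik(x, mus):
    """Log likelihood of an equal-weight Gaussian mixture with means `mus`."""
    d = x[:, None] - mus[None, :]
    comp = -0.5 * (d / SIGMA) ** 2 - np.log(SIGMA) - 0.5 * np.log(2 * np.pi)
    return float(np.sum(logsumexp(comp, axis=1) - np.log(len(mus))))

def log_prior_k(k):
    """Unnormalised truncated Poisson prior on the number of components."""
    return k * np.log(LAMBDA_K) - lgamma(k + 1)

def log_prior_mu(mu):
    """Unnormalised Gaussian prior on a single component mean."""
    return -0.5 * (mu / MU_PRIOR_SD) ** 2

def rjmcmc(x, n_iter=5000):
    mus = np.array([x.mean()])               # start with a single component
    ll = log_lik(x, mus)
    k_samples = []
    for _ in range(n_iter):
        u = rng.random()
        if u < 0.5:
            # Within-model move: random-walk update of one mean.
            j = rng.integers(len(mus))
            prop = mus.copy()
            prop[j] += rng.normal(0.0, 0.5)
            ll_p = log_lik(x, prop)
            log_a = ll_p - ll + log_prior_mu(prop[j]) - log_prior_mu(mus[j])
            if np.log(rng.random()) < log_a:
                mus, ll = prop, ll_p
        elif u < 0.75 and len(mus) < K_MAX:
            # Birth move: add a mean drawn from its prior; the prior of the
            # new mean cancels with its proposal density, Jacobian = 1.
            prop = np.append(mus, rng.normal(0.0, MU_PRIOR_SD))
            ll_p = log_lik(x, prop)
            log_a = ll_p - ll + log_prior_k(len(prop)) - log_prior_k(len(mus))
            if np.log(rng.random()) < log_a:
                mus, ll = prop, ll_p
        elif u >= 0.75 and len(mus) > 1:
            # Death move: remove a uniformly chosen component.
            prop = np.delete(mus, rng.integers(len(mus)))
            ll_p = log_lik(x, prop)
            log_a = ll_p - ll + log_prior_k(len(prop)) - log_prior_k(len(mus))
            if np.log(rng.random()) < log_a:
                mus, ll = prop, ll_p
        k_samples.append(len(mus))
    return np.array(k_samples), mus

# Example: data from three well-separated components.
x = np.concatenate([rng.normal(m, 1.0, 200) for m in (-6.0, 0.0, 6.0)])
ks, _ = rjmcmc(x)
print("posterior mode of k:", np.bincount(ks).argmax())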


Clustering Documents And Gaussian Data With Dirichlet Process Mixture Models

#artificialintelligence

This article is the fifth part of the tutorial on Clustering with DPMM. In the previous posts we covered in detail the theoretical background of the method and described its mathematical representations and ways to construct it. In this post we will try to link the theory with the practice by introducing two DPMM models: the Dirichlet Multivariate Normal Mixture Model, which can be used to cluster Gaussian data, and the Dirichlet-Multinomial Mixture Model, which is used to cluster documents. Update: The Datumbox Machine Learning Framework is now open-source and free to download. Check out the package com.datumbox.framework.machinelearning.clustering to see the implementation of Dirichlet Process Mixture Models in Java.