### Deep Gaussian Mixture Models

Deep learning is a hierarchical inference method built from multiple successive layers of learning, which can describe complex relationships more efficiently. In this work, Deep Gaussian Mixture Models are introduced and discussed. A Deep Gaussian Mixture Model (DGMM) is a network of multiple layers of latent variables, where, at each layer, the variables follow a mixture of Gaussian distributions. Thus, the deep mixture model consists of a set of nested mixtures of linear models, which globally provide a nonlinear model able to describe the data in a very flexible way. To avoid overparameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture, resulting in deep mixtures of factor analysers.
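The nested structure can be illustrated by sampling from such a model. The sketch below assumes each layer is a mixture of linear-Gaussian maps of the form `eta_s + Lambda_s @ z + noise` with diagonal noise variances `psi_s`, and that the deepest latent variable is standard normal. The function name `sample_dgmm` and the dictionary layout are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def sample_dgmm(n, layers, rng):
    """Draw n samples from a deep mixture of linear-Gaussian layers.

    `layers` is ordered from the deepest layer to the observed one.
    Each layer is a dict (hypothetical layout) with:
      weights: (k,) mixing proportions
      eta:     list of k offset vectors, each (d_out,)
      Lambda:  list of k loading matrices, each (d_out, d_in)
      psi:     list of k diagonal noise variances, each (d_out,)
    """
    d_top = layers[0]["Lambda"][0].shape[1]
    z = rng.standard_normal((n, d_top))  # deepest latent: N(0, I)
    for layer in layers:
        k = len(layer["weights"])
        d_out = layer["eta"][0].shape[0]
        # Pick one mixture component per sample at this layer.
        comp = rng.choice(k, size=n, p=layer["weights"])
        out = np.empty((n, d_out))
        for i in range(n):
            s = comp[i]
            noise = rng.standard_normal(d_out) * np.sqrt(layer["psi"][s])
            # Linear model of the layer above, specific to component s.
            out[i] = layer["eta"][s] + layer["Lambda"][s] @ z[i] + noise
        z = out
    return z
```

Each layer is linear given its component label, but composing the random component choices across layers yields a nonlinear model overall, which is the flexibility the abstract refers to.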

### A Locally Adaptive Normal Distribution

The multivariate normal density is a monotonic function of the distance to the mean, and its ellipsoidal shape is due to the underlying Euclidean metric. We propose replacing this metric with a locally adaptive, smoothly changing (Riemannian) metric that favors regions of high local density. The resulting locally adaptive normal distribution (LAND) is a generalization of the normal distribution to the "manifold" setting, where data is assumed to lie near a potentially low-dimensional manifold embedded in $\mathbb{R}^D$. The LAND is parametric, depending only on a mean and a covariance, and is the maximum entropy distribution under the given metric. The underlying metric is, however, non-parametric. We develop a maximum likelihood algorithm to infer the distribution parameters that relies on a combination of gradient descent and Monte Carlo integration. We further extend the LAND to mixture models, and provide the corresponding EM algorithm. We demonstrate the efficiency of the LAND in fitting non-trivial probability distributions over both synthetic data and EEG measurements of human sleep.
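For orientation, a density of this kind is commonly written in the Riemannian-normal form below. This is a sketch under assumptions: the symbols $\mathrm{Log}_\mu$ (the logarithm map at the mean, which sends $x$ to the initial velocity of the geodesic from $\mu$ to $x$), $C$, and $d\mathcal{M}$ are notation introduced here and are not given in the abstract.

$$
p_{\mathcal{M}}(x \mid \mu, \Sigma) = \frac{1}{C(\mu, \Sigma)} \exp\!\left(-\tfrac{1}{2}\,\big\langle \mathrm{Log}_\mu(x),\ \Sigma^{-1}\,\mathrm{Log}_\mu(x) \big\rangle\right),
\qquad
C(\mu, \Sigma) = \int_{\mathcal{M}} \exp\!\left(-\tfrac{1}{2}\,\big\langle \mathrm{Log}_\mu(x),\ \Sigma^{-1}\,\mathrm{Log}_\mu(x) \big\rangle\right) d\mathcal{M}(x)
$$

Under the Euclidean metric, $\mathrm{Log}_\mu(x) = x - \mu$ and the expression reduces to the usual multivariate normal. For a general metric the normalizer $C$ typically has no closed form, which is consistent with the use of Monte Carlo integration in the fitting algorithm.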

### An Application of Reversible-Jump MCMC to Multivariate Spherical Gaussian Mixtures

Applications of Gaussian mixture models occur frequently in the fields of statistics and artificial neural networks. One of the key issues arising from any mixture model application is how to estimate the optimum number of mixture components. This paper extends the Reversible-Jump Markov Chain Monte Carlo (MCMC) algorithm to the case of multivariate spherical Gaussian mixtures using a hierarchical prior model. Using this method, the number of mixture components is no longer fixed but becomes a parameter of the model, which we shall estimate. The Reversible-Jump MCMC algorithm is capable of moving between parameter subspaces which correspond to models with different numbers of mixture components. As a result, a sample from the full joint distribution of all unknown model parameters is generated. The technique is then demonstrated on a simulated example and a well-known vowel dataset.

Applications of Gaussian mixture models regularly appear in the neural networks literature. One of their most common roles in the field of neural networks is in the placement of centres in a radial basis function network.
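Any sampler that moves between models with different numbers of components has to evaluate the mixture likelihood for an arbitrary component count. The sketch below shows only that building block for spherical Gaussians (one variance per component); it does not implement the reversible-jump moves themselves, and the function name `spherical_gmm_loglik` is hypothetical.

```python
import numpy as np

def spherical_gmm_loglik(X, weights, means, variances):
    """Log-likelihood of data X under a spherical Gaussian mixture.

    X:         (n, d) data
    weights:   (k,) mixing proportions summing to 1
    means:     (k, d) component centres
    variances: (k,) one variance per component (covariance = variance * I)
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    d = X.shape[1]
    # Squared distance from every point to every centre: shape (n, k).
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_comp = (
        np.log(weights)
        - 0.5 * d * np.log(2.0 * np.pi * variances)
        - 0.5 * sq_dist / variances
    )
    # Log-sum-exp over components, for numerical stability.
    m = log_comp.max(axis=1, keepdims=True)
    per_point = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return float(per_point.sum())
```

Because the function accepts any `k`, the same code can score a proposal that adds or removes a component, which is what an acceptance ratio between models of different dimension would need alongside the prior and proposal terms.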

### Image restoration with generalized Gaussian mixture model patch priors

Patch priors have become an important component of image restoration. A powerful approach in this category of restoration algorithms is the popular Expected Patch Log-likelihood (EPLL) algorithm. EPLL uses a Gaussian mixture model (GMM) prior learned on clean image patches as a way to regularize degraded patches. In this paper, we show that a generalized Gaussian mixture model (GGMM) captures the underlying distribution of patches better than a GMM. Even though GGMM is a powerful prior to combine with EPLL, the non-Gaussianity of its components makes it challenging to apply in the computationally intensive process of image restoration. Specifically, each patch has to undergo a patch classification step and a shrinkage step. These two steps can be efficiently solved with a GMM prior but are computationally impractical when using a GGMM prior. In this paper, we provide approximations and computational recipes for fast evaluation of these two steps, so that EPLL can embed a GGMM prior on an image with more than tens of thousands of patches. Our main contribution is to analyze the accuracy of our approximations based on thorough theoretical analysis. Our evaluations indicate that the GGMM prior is consistently a better fit for modeling image patch distribution and performs better on average in image denoising tasks.
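The two per-patch steps are easy to see in the plain GMM case, where both have closed forms. The sketch below assumes zero-mean Gaussian components and additive white Gaussian noise of known variance: classification picks the component with the highest posterior score for the noisy patch, and shrinkage applies the corresponding Wiener filter. The function name `gmm_patch_denoise` is hypothetical, and this covers only the GMM baseline, not the GGMM approximations the paper develops.

```python
import numpy as np

def gmm_patch_denoise(y, weights, covs, noise_var):
    """Classify a noisy patch under a zero-mean GMM, then shrink it.

    y:         (d,) vectorized noisy patch
    weights:   (k,) mixing proportions
    covs:      list of k (d, d) component covariances
    noise_var: variance of the additive white Gaussian noise
    Returns (index of selected component, denoised patch).
    """
    y = np.asarray(y, dtype=float)
    d = y.size
    eye = np.eye(d)
    best_k, best_score = -1, -np.inf
    # Classification: the noisy patch under component k is N(0, S_k + noise_var * I).
    for k, (w, S) in enumerate(zip(weights, covs)):
        C = np.asarray(S, dtype=float) + noise_var * eye
        _, logdet = np.linalg.slogdet(C)
        score = np.log(w) - 0.5 * logdet - 0.5 * y @ np.linalg.solve(C, y)
        if score > best_score:
            best_k, best_score = k, score
    # Shrinkage: Wiener filter for the selected component.
    S = np.asarray(covs[best_k], dtype=float)
    x_hat = S @ np.linalg.solve(S + noise_var * eye, y)
    return best_k, x_hat
```

With generalized Gaussian components neither the score nor the shrinkage has such a closed form, which is the computational obstacle the abstract describes.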

### EP-GIG Priors and Applications in Bayesian Sparse Learning

In this paper we propose a novel framework for the construction of sparsity-inducing priors. In particular, we define such priors as a mixture of exponential power distributions with a generalized inverse Gaussian density (EP-GIG). EP-GIG is a variant of generalized hyperbolic distributions, and the special cases include Gaussian scale mixtures and Laplace scale mixtures. Furthermore, Laplace scale mixtures can subserve a Bayesian framework for sparse learning with nonconvex penalization. The densities of EP-GIG can be explicitly expressed. Moreover, the corresponding posterior distribution also follows a generalized inverse Gaussian distribution. These properties lead us to EM algorithms for Bayesian sparse learning. We show that these algorithms bear an interesting resemblance to iteratively re-weighted $\ell_2$ or $\ell_1$ methods. In addition, we present two extensions for grouped variable selection and logistic regression.
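The resemblance to iteratively re-weighted $\ell_2$ methods can be seen in the simplest familiar special case: a Laplace prior written as a Gaussian scale mixture, for which each EM-style update is a weighted ridge regression. The sketch below assumes the objective $\tfrac{1}{2}\|y - Xb\|^2 + \lambda\|b\|_1$ and uses weights $1/|b_j|$ from the previous iterate. The function name `irls_l1` and the stabilizer `eps` are hypothetical, and this is not the general EP-GIG algorithm from the paper.

```python
import numpy as np

def irls_l1(X, y, lam, n_iter=100, eps=1e-8):
    """Iteratively re-weighted l2 updates for l1-penalized least squares.

    Each step solves (X^T X + lam * diag(1 / (|b_old| + eps))) b = X^T y,
    which minimizes a quadratic upper bound on the l1 penalty at b_old.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    XtX = X.T @ X
    Xty = X.T @ y
    # Start from a ridge solution so that no weight is infinite at the outset.
    b = np.linalg.solve(XtX + lam * np.eye(p), Xty)
    for _ in range(n_iter):
        # Coefficients near zero get large weights and are shrunk further.
        w = 1.0 / (np.abs(b) + eps)
        b = np.linalg.solve(XtX + lam * np.diag(w), Xty)
    return b
```

The weights play the role of the expected inverse scales computed in an E-step, while the linear solve is the M-step; small coefficients are driven toward zero over the iterations rather than set to zero exactly.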