
Clustering, Classification, Discriminant Analysis, and Dimension Reduction via Generalized Hyperbolic Mixtures

Machine Learning

A method for dimension reduction with clustering, classification, or discriminant analysis is introduced. This mixture model-based approach fits generalized hyperbolic mixtures on a reduced subspace within the paradigm of model-based clustering, classification, or discriminant analysis. A reduced subspace of the data is derived by considering the extent to which group means and group covariances vary. The members of the subspace arise through linear combinations of the original data, and are ordered by importance via the associated eigenvalues. The observations can be projected onto the subspace, resulting in a set of variables that captures most of the clustering information available. The use of generalized hyperbolic mixtures gives a robust framework capable of dealing with skewed clusters. Although dimension reduction is increasingly in demand across many application areas, the authors are most familiar with biological applications, and so two of the five real data examples are within that sphere. Simulated data are also used for illustration. The approach introduced herein can be considered the most general such approach available, and so we compare results to three special and limiting cases. Comparisons with several well-established techniques illustrate its promising performance.

A Mixture of Generalized Hyperbolic Factor Analyzers

Machine Learning

Model-based clustering imposes a finite mixture modelling structure on data for clustering. Finite mixture models assume that the population is a convex combination of a finite number of densities; the distribution within each sub-population is a basic assumption of each particular model. Among all distributions that have been tried, the generalized hyperbolic distribution has the advantage that it is a generalization of several other distributions, such as the Gaussian distribution and the skew t-distribution. With specific parameters, it can represent either a symmetric or a skewed distribution. While its inherent flexibility is an advantage in many ways, it requires the estimation of more parameters than its special and limiting cases. The aim of this work is to propose a mixture of generalized hyperbolic factor analyzers to introduce parsimony and extend the method to high-dimensional data. This work can be seen as an extension of the mixture of factor analyzers model to generalized hyperbolic mixtures. The performance of our generalized hyperbolic factor analyzers is illustrated on real data, where it performs favourably compared to its Gaussian analogue.
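As a rough illustration of the "convex combination of a finite number of densities" idea, the sketch below evaluates a two-component mixture density and the resulting soft cluster memberships. It uses univariate Gaussian components purely as a stand-in, since the generalized hyperbolic density is considerably more involved; the function names and parameter values are hypothetical and are not taken from the paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Univariate Gaussian density (stand-in for a component density)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Finite mixture density: sum_g weights[g] * f_g(x)."""
    # The mixing proportions must be positive and sum to one,
    # which is what makes the combination convex.
    if any(w <= 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be positive and sum to 1")
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def posterior_memberships(x, weights, means, sds):
    """Probability that x belongs to each component (soft clustering)."""
    parts = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(parts)
    return [p / total for p in parts]


# Hypothetical parameters for two well-separated components.
weights, means, sds = [0.3, 0.7], [-2.0, 3.0], [1.0, 1.5]
print(mixture_pdf(0.0, weights, means, sds))
print(posterior_memberships(0.0, weights, means, sds))
```

In model-based clustering, an observation is typically assigned to the component with the largest membership probability; replacing the Gaussian stand-in with a skewed component density changes the shape of each cluster but not this overall structure.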


AAAI Conferences

Unlike traditional LASSO, which enforces sparsity on the variables, Generalized LASSO (GL) enforces sparsity on a linear transformation of the variables, gaining flexibility and success in many applications. However, many existing GL algorithms do not scale up to high-dimensional problems, and/or only work well for a specific choice of the transformation. We propose an efficient Matching Pursuit Generalized LASSO (MPGL) method, which overcomes these issues and is guaranteed to converge to a global optimum. We formulate the GL problem as a convex quadratically constrained linear programming (QCLP) problem and tailor a cutting plane method to it. More specifically, our MPGL iteratively activates a subset of nonzero elements of the transformed variables, and solves a subproblem involving only the activated elements, thus gaining significant speed-up. Moreover, MPGL is less sensitive to the choice of the trade-off hyper-parameter between data fitting and regularization, and mitigates the long-standing hyper-parameter tuning issue in many existing methods. Experiments demonstrate the superior efficiency and accuracy of the proposed method over the state of the art in both classification and image processing tasks.
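To make the difference between LASSO and Generalized LASSO concrete, the sketch below evaluates the penalized objective for a given coefficient vector. It assumes the common least-squares form, 0.5 * ||y - X b||^2 + lam * ||D b||_1, which the abstract does not spell out; the helper names are hypothetical, and this only evaluates the objective, it is not the MPGL solver.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def generalized_lasso_objective(X, y, beta, D, lam):
    """0.5 * ||y - X beta||^2 + lam * ||D beta||_1 (assumed standard form)."""
    residual = [yi - fi for yi, fi in zip(y, matvec(X, beta))]
    fit = 0.5 * sum(r * r for r in residual)
    # The L1 penalty acts on the transformed variables D beta,
    # not on beta itself.
    penalty = lam * sum(abs(t) for t in matvec(D, beta))
    return fit + penalty


def identity_matrix(p):
    """D = I recovers the traditional LASSO penalty."""
    return [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]


def first_difference_matrix(p):
    """D with rows (..., -1, 1, ...) penalizes jumps between neighbours."""
    return [[-1.0 if j == i else 1.0 if j == i + 1 else 0.0 for j in range(p)]
            for i in range(p - 1)]


X = identity_matrix(3)
y = [1.0, 1.0, 3.0]
beta = [1.0, 1.0, 3.0]
print(generalized_lasso_objective(X, y, beta, identity_matrix(3), 0.5))
print(generalized_lasso_objective(X, y, beta, first_difference_matrix(3), 0.5))
```

With the first-difference transformation, a piecewise-constant coefficient vector is penalized only at its jumps, which is one reason the choice of transformation matters in applications such as signal and image processing.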

Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models

Neural Information Processing Systems

Hyperbolic embeddings achieve excellent performance when embedding hierarchical data structures like synonym or type hierarchies, but they can be limited by numerical error when ordinary floating-point numbers are used to represent points in hyperbolic space. Standard models such as the Poincaré disk and the Lorentz model have unbounded numerical error as points get far from the origin. To address this, we propose a new model which uses an integer-based tiling to represent any point in hyperbolic space with provably bounded numerical error. This allows us to learn high-precision embeddings without using BigFloats, and enables us to store the resulting embeddings with fewer bits. We evaluate our tiling-based model empirically, and show that it can both compress hyperbolic embeddings (down to 2% of a Poincaré embedding on WordNet Nouns) and learn more accurate embeddings on real-world datasets.
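The numerical problem described here can be observed with a few lines of ordinary double-precision arithmetic. The sketch below places a point on the two-dimensional Lorentz hyperboloid at hyperbolic distance t from the origin and checks how well the defining constraint -x0^2 + x1^2 = -1 survives rounding. It only illustrates the failure mode; it is not the tiling-based model, and the function name is hypothetical.

```python
import math


def lorentz_constraint_residual(t):
    """Residual of -x0^2 + x1^2 = -1 for the point (cosh t, sinh t).

    In exact arithmetic the residual is 0 for every t. In floating point,
    x0^2 and x1^2 grow like exp(2t) / 4, so their difference loses the
    digits needed to recover the constant -1.
    """
    x0 = math.cosh(t)
    x1 = math.sinh(t)
    return (x1 * x1 - x0 * x0) + 1.0


# The residual is tiny near the origin and grows as t increases.
for t in (1.0, 5.0, 10.0, 20.0):
    print(t, lorentz_constraint_residual(t))
```

Near the origin the residual stays at the level of machine precision, while far from the origin the point no longer lies on the hyperboloid in any meaningful sense, which is the kind of unbounded error a representation with bounded numerical error is meant to avoid.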

Hyperbolic Neural Networks

Neural Information Processing Systems

Hyperbolic spaces have recently gained momentum in the context of machine learning due to their high capacity and tree-likeness properties. However, the representational power of hyperbolic geometry is not yet on par with Euclidean geometry, mainly because of the absence of corresponding hyperbolic neural network layers. To address this, we derive hyperbolic versions of important deep learning tools: multinomial logistic regression, feed-forward and recurrent neural networks. This allows us to embed sequential data and perform classification in hyperbolic space. Empirically, we show that, even if hyperbolic optimization tools are limited, hyperbolic sentence embeddings either outperform or are on par with their Euclidean variants on textual entailment and noisy-prefix recognition tasks.
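Hyperbolic analogues of Euclidean layers are commonly built from operations on the Poincaré ball. The sketch below shows two such building blocks for curvature -1: Möbius addition, which plays the role of vector addition, and the geodesic distance derived from it. That these particular operations are the ones used is an assumption, since the abstract does not name them, and the function names are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def mobius_add(x, y):
    """Mobius addition on the unit Poincare ball (curvature -1)."""
    xy, x2, y2 = dot(x, y), dot(x, x), dot(y, y)
    denom = 1.0 + 2.0 * xy + x2 * y2
    a = 1.0 + 2.0 * xy + y2  # coefficient of x
    b = 1.0 - x2             # coefficient of y
    return [(a * xi + b * yi) / denom for xi, yi in zip(x, y)]


def poincare_distance(x, y):
    """Geodesic distance d(x, y) = 2 * artanh(||(-x) (+) y||)."""
    diff = mobius_add([-xi for xi in x], y)
    return 2.0 * math.atanh(math.sqrt(dot(diff, diff)))


x = [0.3, 0.4]
y = [-0.2, 0.1]
print(mobius_add(x, y))         # stays inside the unit ball
print(poincare_distance(x, y))  # symmetric in x and y
```

Unlike ordinary addition, Möbius addition is neither commutative nor associative, although the distance it induces is symmetric; a hyperbolic bias term in a layer would be applied with an operation of this kind rather than with a plain sum.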