Going deep in clustering high-dimensional data: deep mixtures of unigrams for uncovering topics in textual data

Anderlucci, Laura, Viroli, Cinzia

arXiv.org Machine Learning 

Deep learning methods can be broadly defined as multi-layer stacks of algorithms or modules that gradually learn a huge number of parameters in an architecture composed of multiple nonlinear transformations (LeCun et al., 2015). Typically, and for historical reasons, a deep learning structure is identified with advanced neural networks: deep feed-forward, recurrent, auto-encoder, and convolutional neural networks are very effective and widely used deep learning algorithms (Schmidhuber, 2015). They have proved particularly successful in supervised classification problems arising in several fields, such as image and speech recognition, gene expression data, and topic classification. When the aim is to uncover unknown classes from an unsupervised classification perspective, important deep learning methods have been developed along the lines of mixture modeling, because of the ability of mixture models to decompose a heterogeneous collection of units into a finite number of subgroups with homogeneous structures (Fraley and Raftery, 2002; McLachlan and Peel, 2000). In this direction, van den Oord and Schrauwen (2014) proposed Multilayer Gaussian Mixture Models for modeling natural images; Tang et al. (2012) defined deep mixtures of factor analyzers with a greedy layer-wise learning algorithm that learns one layer at a time. Viroli and McLachlan (2019) developed a general framework for Deep Gaussian mixture models that generalizes and encompasses the previous strategies and several flexible model-based clustering methods, such as mixtures of mixture models (Li, 2005), mixtures of factor analyzers (McLachlan et al., 2003), mixtures of factor analyzers with common factor loadings (Baek et al., 2010), heteroscedastic factor mixture analysis (Montanari and Viroli, 2010) and mixtures of factor mixture analyzers introduced by Viroli (2010).
A general 'take-home message' from the existing deep clustering strategies is that deep methods, compared with shallow ones, appear to be very efficient and powerful tools, especially for complex high-dimensional data; conversely, for simple and small data structures, a deep learning strategy cannot improve on the performance of simpler, conventional methods or, to put it better, it is like using a 'sledgehammer to crack a nut'. The motivating problem behind this work derives from ticket data (i.e.
