Model Based Clustering of High-Dimensional Binary Data
Yang Tang, Ryan P. Browne, Paul D. McNicholas
We propose a mixture of latent trait models with common slope parameters (MCLT) for model-based clustering of high-dimensional binary data, a data type for which few established methods exist. Recent work on clustering of binary data, based on a $d$-dimensional Gaussian latent variable, is extended by incorporating common factor analyzers. Accordingly, our approach facilitates a low-dimensional visual representation of the clusters. We extend the model further by the incorporation of random block effects. The dependencies in each block are taken into account through block-specific parameters that are considered to be random variables. A variational approximation to the likelihood is exploited to derive a fast algorithm for determining the model parameters. Our approach is demonstrated on real and simulated data.
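To make the generative idea concrete, the following is a minimal simulation sketch in Python, not the authors' implementation. It assumes a logistic link, component-specific means for the $d$-dimensional Gaussian latent variable, unit-variance independent latent coordinates, and slopes and intercepts shared by all components; none of these details are specified in the abstract. The function name `simulate_mclt` and all parameter names are hypothetical. The random block effects and the variational fitting algorithm are not covered.

```python
import math
import random


def simulate_mclt(n, weights, means, slopes, intercepts, seed=0):
    """Draw n binary vectors from a toy mixture of latent trait models.

    Assumed (illustrative) parameterization:
      weights    : G mixing proportions
      means      : G latent mean vectors, each of length d
      slopes     : M x d slope matrix, common to all components
      intercepts : M intercepts, common to all components
    """
    rng = random.Random(seed)
    d = len(slopes[0])
    X, Y, labels = [], [], []
    for _ in range(n):
        # Pick a mixture component for this observation.
        g = rng.choices(range(len(weights)), weights=weights)[0]
        # Latent trait: Gaussian with component-specific mean, unit variance.
        y = [rng.gauss(means[g][k], 1.0) for k in range(d)]
        row = []
        for w_m, b_m in zip(slopes, intercepts):
            eta = b_m + sum(w * y_k for w, y_k in zip(w_m, y))
            # Logistic function, written via tanh for numerical stability.
            p = 0.5 * (1.0 + math.tanh(0.5 * eta))
            row.append(1 if rng.random() < p else 0)
        X.append(row)
        Y.append(y)
        labels.append(g)
    return X, Y, labels
```

Because the slopes are shared across components in this sketch, every observation's latent vector lives in the same $d$-dimensional space. With $d = 2$, a scatter plot of the latent coordinates coloured by component label gives the kind of low-dimensional visual representation the abstract refers to.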
Apr-26-2014
- Country:
- North America
- Canada > Ontario
- Wellington County > Guelph (0.14)
- United States (0.93)
- Genre:
- Research Report (0.64)
- Industry:
- Government > Regional Government (0.93)