Variable selection for clustering with Gaussian mixture models: state of the art
Talibi, Abdelghafour, Achchab, Boujemâa, Lasri, Rafik
SAA T Laboratory, University of Abdelmalek Essadi, FPL, Larache, Morocco
Corresponding author: Abdelghafour Talibi, a.talibi@uhp.ac.ma

Abstract
Mixture models have become widely used in clustering because of the probabilistic framework on which they are based. However, for modern databases, which are characterized by their large size, these models perform disappointingly when the model is specified, which makes selecting the relevant variables essential for this type of clustering. After recalling the basics of model-based clustering, this article examines variable selection methods for model-based clustering and presents opportunities for improving these methods.

I INTRODUCTION
Clustering aims to partition the objects of a population into groups such that objects in the same group are similar to each other and objects in different groups are dissimilar. Unlike supervised classification, where the groups are known in advance, at least for a sample, in clustering the number of groups is unknown and remains to be estimated. Many fields of research apply clustering methods to their data in order to obtain groups that help in understanding and interpreting the phenomenon studied.
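To make the point about the unknown number of groups concrete, the following is a minimal sketch, not the authors' procedure. It fits one-dimensional Gaussian mixtures by EM for several candidate numbers of components and compares them with the Bayesian information criterion (BIC), a criterion commonly used in model-based clustering. The function names (fit_gmm_1d, bic_by_k), the quantile-based initialization, the variance floor, and the synthetic sample are all assumptions made for illustration. BIC is written here as -2 log L + p log n, so lower values are preferred.

```python
import math
import random


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of 1-D data under a Gaussian mixture."""
    total = 0.0
    for x in data:
        dens = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(dens or 1e-300)  # guard against underflow to zero
    return total


def fit_gmm_1d(data, k, n_iter=100, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM; return its log-likelihood."""
    n = len(data)
    xs = sorted(data)
    # Deterministic start (an illustrative choice): means at evenly spaced
    # quantiles, a shared variance equal to the overall sample variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [max(var_all, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            s = sum(dens) or 1e-300
            resp.append([d / s for d in dens])
        # M-step: update weights, means and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)  # floor avoids degenerate components
    return log_likelihood(data, weights, means, variances)


def bic_by_k(data, k_values):
    """BIC = -2 log L + p log n, with p = 3k - 1 free parameters in 1-D."""
    n = len(data)
    return {
        k: -2.0 * fit_gmm_1d(data, k) + (3 * k - 1) * math.log(n)
        for k in k_values
    }


# Synthetic sample with two well-separated groups (illustrative data only).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
sample += [rng.gauss(10.0, 1.0) for _ in range(200)]

scores = bic_by_k(sample, [1, 2, 3])
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

The candidate with the smallest score is retained as the estimated number of groups. In practice a multivariate implementation from an established library would replace this hand-written EM loop.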
Jan-31-2017
- Genre:
- Research Report (1.00)