Biarchetype analysis: simultaneous learning of observations and features based on extremes

Alcacer, Aleix, Epifanio, Irene, Gual-Arnau, Ximo

arXiv.org Machine Learning 

Cluster analysis (CLA) is one of the most widely used tools in exploratory data analysis. The idea of clustering is to group observations so that each group contains similar observations that differ from those in the other groups. If the data consist of well-separated clusters, appropriate clustering techniques can obtain, on the one hand, a representative of each cluster (the mean or centroid of the cluster in the popular k-means technique) and, on the other hand, the assignment of each observation to one cluster, or a degree of membership in each cluster for fuzzy clustering techniques. However, CLA is also used as a segmentation technique in the absence of well-separated (clearly differentiated) clusters in the data. Often, data follow a fan-spread pattern, i.e. features vary continuously across observations. The centroids are then located in the middle of the data cloud, since the data points have to be covered in such a way that the distance between each point and its assigned centroid is minimized (see [Wu et al., 2016] about the relationship between CLA and set partitioning). In those cases, where data can be viewed as a superposition of various populations, it is of particular interest to use Archetype Analysis (AA) for segmentation [Keller et al., 2019].
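
The claim that k-means centroids sit inside the data cloud rather than at its extremes can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes synthetic fan-spread data built as convex mixtures of three hypothetical "pure" profiles (the vertices of a triangle), and uses a plain Lloyd's iteration written in NumPy. The names `kmeans`, `vertices`, and `gap` are illustrative.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations (fancy indexing returns a copy).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every observation to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned observations.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Assumed toy data: each observation is a convex mixture of three extreme
# profiles, so features vary continuously and no separated clusters exist.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
rng = np.random.default_rng(42)
weights = rng.dirichlet([1.0, 1.0, 1.0], size=500)  # rows sum to 1
X = weights @ vertices

centroids, labels = kmeans(X, k=3)

# Smallest distance between any centroid and any extreme profile.
gap = np.linalg.norm(centroids[:, None, :] - vertices[None, :, :], axis=2).min()
print("centroids:\n", centroids)
print("smallest centroid-to-vertex distance:", gap)
```

Because each centroid is the mean of its assigned observations, it lies inside the convex hull of the data and cannot coincide with an extreme profile unless its cluster collapses to that single point. An archetype-based method would instead look for representatives near the vertices.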
