Benchmarking of Clustering Validity Measures Revisited
Simpson, Connor; Campello, Ricardo J. G. B.; Stojanovski, Elizabeth
Clustering is an unsupervised learning technique that aims to identify patterns consisting of similar or interrelated observations within data [39, 87]. Existing clustering algorithms are often categorised into three primary groups [39, 82]: partitioning algorithms such as K-Means [39] and Spectral Clustering [88], hierarchical algorithms such as Single Linkage [39] and HDBSCAN* [7, 8], and soft (fuzzy or probabilistic) algorithms such as Fuzzy c-Means (FCM) [4] and Expectation Maximisation with Gaussian Mixture Models (EM-GMM) [20]. Partitioning algorithms divide the data into a given number k of clusters, while hierarchical algorithms produce a sequence of nested partitions with incrementally varying numbers of clusters. Soft clustering algorithms are similar to partitioning techniques, except that each observation is assigned a degree of membership or a probability for each cluster rather than being fully assigned to a single cluster. Within these categories, some algorithms do not necessarily assign every observation to a cluster, owing to outlier trimming or noise detection; two examples are trimmed K-means [14] and the previously mentioned HDBSCAN*. Clustering validation (or validity) is an important step of the clustering process irrespective of the algorithm used [39, 25], as it is crucial for determining the best partition(s) produced and the number of clusters within the data [23].
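As a rough illustration of the three kinds of output described above, the following sketch uses only the Python standard library on a made-up one-dimensional dataset. The data values, the fixed centres, and the function names `hard_assign`, `soft_assign` and `single_linkage` are assumptions introduced here for illustration and do not come from the paper. The centres are fixed by hand rather than fitted, so this is neither K-Means nor FCM itself, only the shape of the result each family returns.

```python
from math import inf

# Toy 1-D data with two visually obvious groups (illustrative values only).
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centres = [1.0, 5.0]  # assumed fixed centres, chosen by hand, not fitted


def hard_assign(points, centres):
    """Partitioning-style output: exactly one cluster label per observation."""
    return [min(range(len(centres)), key=lambda j: abs(x - centres[j]))
            for x in points]


def soft_assign(points, centres, m=2.0):
    """FCM-style membership degrees for fixed centres; each row sums to 1."""
    rows = []
    for x in points:
        d = [abs(x - c) for c in centres]
        zeros = [dj == 0.0 for dj in d]
        if any(zeros):
            # The observation coincides with a centre: full membership there.
            n = sum(zeros)
            rows.append([1.0 / n if z else 0.0 for z in zeros])
            continue
        inv = [dj ** (-2.0 / (m - 1.0)) for dj in d]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def single_linkage(points):
    """Hierarchical-style output: nested partitions from n clusters down to 1."""
    clusters = [[i] for i in range(len(points))]
    levels = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = (inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                dist = min(abs(points[i] - points[j])
                           for i in clusters[a] for j in clusters[b])
                if dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        levels.append([list(c) for c in clusters])
    return levels


labels = hard_assign(data, centres)       # one label per observation
memberships = soft_assign(data, centres)  # one row of degrees per observation
levels = single_linkage(data)             # one partition per hierarchy level
```

Each entry of `levels` is a partition of the observation indices, so a validity measure could in principle be evaluated on every level to choose the number of clusters, which is the selection problem the final sentence of the paragraph refers to.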
Nov-11-2025
- Technology: