Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming
Dereure, Erwan; Mfoumou, Emmanuel Akame; Holcman, David
The automated identification of clusters or isolated points is a fundamental step in many classification and spatial analysis pipelines [1, 2, 3], where it serves to reveal structure in unlabeled data. Clustering typically begins by assigning labels to data points, indicating their membership in one or more groups. However, the strategies used to define these groups vary significantly across clustering methods, depending on the underlying assumptions about data structure, density, or similarity. Clustering and classification algorithms can be broadly categorized into partitioning-based, hierarchical, and density-based methods. Partitioning methods, such as K-means [4, 5], Spectral Clustering [6], and Support Vector Machines (SVMs) [7], divide the data into distinct groups by optimizing specific criteria. K-means partitions data into a fixed number of clusters by minimizing within-cluster variance, which favors roughly spherical groups. Spectral Clustering extends partitioning by using the eigenstructure of a similarity graph: an embedding step followed by a partitioning algorithm lets it identify clusters with complex, non-convex shapes. Similarly, SVMs perform classification by implicitly mapping data into a higher-dimensional feature space with the kernel trick, so that the data can be partitioned by linear separation in the transformed space.
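To make the K-means criterion concrete, here is a minimal sketch of Lloyd's algorithm on one-dimensional data, written in plain Python. It is an illustration of the generic technique only, not the method proposed in this paper. The function names (`kmeans_1d`, `within_cluster_variance`), the evenly spread initialisation, and the sample values are all assumptions made for this example.

```python
from statistics import fmean


def kmeans_1d(points, k, iterations=100):
    """Lloyd's algorithm on scalar data; returns (centers, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    ordered = sorted(points)
    # Deterministic initialisation (an assumption of this sketch):
    # spread the initial centers evenly across the sorted values.
    if k == 1:
        centers = [fmean(ordered)]
    else:
        centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            new_centers.append(fmean(members) if members else centers[j])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, labels


def within_cluster_variance(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum((x - centers[lab]) ** 2 for x, lab in zip(points, labels))


# Hypothetical sample: two well-separated groups on the real line.
data = [0.9, 1.0, 1.2, 4.8, 5.0, 5.3]
centers, labels = kmeans_1d(data, k=2)
cost = within_cluster_variance(data, centers, labels)
```

On this sample the first three values end up in one cluster and the last three in the other. Each iteration can only lower or preserve the within-cluster sum of squares, which is the quantity K-means minimizes, although the result in general depends on the initial centers.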
Sep-3-2025