Scalable unsupervised feature selection via weight stability
Zhang, Xudong, de Amorim, Renato Cordeiro
arXiv.org Artificial Intelligence
Unsupervised feature selection is critical for improving clustering performance in high-dimensional data, where irrelevant features can obscure meaningful structure. In this work, we introduce the Minkowski Weighted $k$-means++, a novel initialisation strategy for the Minkowski Weighted $k$-means. Our initialisation selects centroids probabilistically, using feature relevance estimates derived from the data itself. Building on this, we propose two new feature selection algorithms: FS-MWK++, which aggregates feature weights across a range of Minkowski exponents to identify stable and informative features, and SFS-MWK++, a scalable variant based on subsampling. We support our approach with a theoretical guarantee under mild assumptions and extensive experiments showing that our methods consistently outperform existing alternatives. Our software can be found at https://github.com/xzhang4-ops1/FSMWK.
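To make the idea of aggregating feature weights across Minkowski exponents concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard Minkowski Weighted $k$-means weight update, in which a feature with low within-cluster dispersion receives a high weight, and then averages those weights over several exponents. The function names `mwk_weights` and `aggregate_weights`, the choice of exponents, the use of cluster means as centroids, and the plain averaging are all assumptions made for illustration; the abstract does not specify them. The sketch also assumes a partition is already available, so it does not cover the initialisation step.

```python
import numpy as np

def mwk_weights(X, labels, centroids, p, eps=1e-12):
    """Per-cluster feature weights for a Minkowski exponent p > 1.

    Uses w_cv = 1 / sum_u (D_cv / D_cu)^(1 / (p - 1)), where D_cv is the
    within-cluster dispersion of feature v in cluster c.
    """
    k, m = centroids.shape
    W = np.empty((k, m))
    for c in range(k):
        # Dispersion of each feature within cluster c (eps avoids division by zero).
        D = (np.abs(X[labels == c] - centroids[c]) ** p).sum(axis=0) + eps
        # ratios[v, u] = (D_v / D_u)^(1 / (p - 1))
        ratios = (D[:, None] / D[None, :]) ** (1.0 / (p - 1.0))
        W[c] = 1.0 / ratios.sum(axis=1)
    return W

def aggregate_weights(X, labels, exponents=(1.5, 2.0, 3.0, 4.0)):
    """Average feature weights over clusters and over exponents (assumed scheme).

    Assumes labels are 0..k-1 and every cluster is non-empty. Cluster means
    stand in for the Minkowski centres to keep the sketch short.
    """
    k = int(labels.max()) + 1
    centroids = np.vstack([X[labels == c].mean(axis=0) for c in range(k)])
    per_p = [mwk_weights(X, labels, centroids, p).mean(axis=0) for p in exponents]
    return np.mean(per_p, axis=0)
```

Within each cluster the weights sum to one, so the aggregated scores also sum to one and can be read as relative feature relevance. Features whose score stays low across exponents would be candidates for removal under this simplified scheme.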
Jun-16-2025
- Country:
  - Europe > Poland > Masovia Province > Warsaw (0.04)
  - North America > United States > California (0.04)
- Genre:
  - Research Report > New Finding (0.46)