K-means Derived Unsupervised Feature Selection using Improved ADMM
Sun, Ziheng, Ding, Chris, Fan, Jicong
JOURNAL OF LaTeX CLASS FILES, VOL. 18, NO. 9, SEPTEMBER 2020

Abstract: Feature selection is important for high-dimensional data analysis and is non-trivial in unsupervised learning problems such as dimensionality reduction and clustering. The goal of unsupervised feature selection is to find a subset of features such that the data points from different clusters are well separated. This paper presents a novel method called K-means Derived Unsupervised Feature Selection (K-means UFS). Unlike most existing unsupervised feature selection methods, which are based on spectral analysis, we select features using the objective of K-means. We develop an alternating direction method of multipliers (ADMM) algorithm to solve the NP-hard optimization problem of our K-means UFS model. Extensive experiments on real datasets show that our K-means UFS is more effective than the baselines in selecting features for clustering.

INTRODUCTION

Feature selection aims to select a subset from a large number of features and is particularly useful in dealing with high-dimensional data such as gene data in bioinformatics. The selected features should preserve the most important information of the data for downstream tasks such as classification and clustering. Many unsupervised feature selection methods have been proposed in the past decades.
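To make the idea of judging features by the K-means objective concrete, the following is a minimal illustrative sketch. It is not the K-means UFS model or its ADMM solver, neither of which is specified in this excerpt. It assumes a plain Lloyd's K-means written in NumPy and a hypothetical score, `relative_objective`, defined here as the within-cluster sum of squares divided by the total sum of squares on a candidate feature subset (lower suggests better separated clusters). The function names, the toy data, and the normalization are all assumptions made for illustration.

```python
import numpy as np


def kmeans_objective(X, k, n_iter=50, seed=0):
    """Run plain Lloyd's K-means and return the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct data points (fancy indexing copies).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center: (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def relative_objective(X, features, k):
    """Hypothetical score: K-means objective on a feature subset,
    normalized by the total sum of squares of that subset."""
    Xs = X[:, features]
    total = ((Xs - Xs.mean(axis=0)) ** 2).sum()
    return kmeans_objective(Xs, k) / total


# Toy data (assumed): features 0 and 1 separate two clusters,
# features 2 and 3 are pure noise.
rng = np.random.default_rng(0)
n = 100
informative = np.vstack([
    rng.normal(0.0, 1.0, size=(n, 2)),
    rng.normal(10.0, 1.0, size=(n, 2)),
])
noise = rng.normal(0.0, 1.0, size=(2 * n, 2))
X = np.hstack([informative, noise])

score_informative = relative_objective(X, [0, 1], k=2)
score_noise = relative_objective(X, [2, 3], k=2)
print("informative features:", score_informative)
print("noise features:", score_noise)
```

On this toy data the informative pair should receive a much lower score than the noise pair. Exhaustively scoring all subsets in this way is combinatorial, which is consistent with the abstract's remark that the underlying optimization problem is NP-hard and calls for a dedicated solver.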
Nov-19-2024