Feature Selection For High-Dimensional Clustering
Wasserman, Larry; Azizyan, Martin; Singh, Aarti
There are many methods for feature selection in high-dimensional classification and regression. These methods require assumptions such as sparsity and incoherence. Some methods (Fan and Lv, 2008) also assume that relevant variables are detectable through marginal correlations. Under these assumptions, one can prove performance guarantees for the method. A similar theory for feature selection in clustering is lacking: a number of methods exist, but they do not come with precise assumptions or guarantees. In this paper we propose a method involving two steps: 1. A screening step to eliminate uninformative features.
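To make the marginal-correlation assumption concrete, here is a minimal sketch of correlation-based screening in the supervised (regression) setting that the abstract attributes to Fan and Lv (2008). It is not the clustering method proposed in this paper, whose details are not given here. The function names `pearson` and `marginal_screen`, the `keep` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def marginal_screen(X, y, keep):
    """Return sorted indices of the `keep` features with largest |corr(X_j, y)|.

    X is a list of rows (one row per sample); y is the response.
    `keep` is a hypothetical tuning parameter, not taken from the paper.
    """
    p = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy data (made up): features 0 and 2 are linear in y,
# feature 1 alternates in sign, feature 3 is constant.
y = [1, 2, 3, 4, 5, 6]
X = [[2 * v, (-1) ** i, -v, 5] for i, v in enumerate(y)]

print(marginal_screen(X, y, keep=2))
```

Screening of this kind ranks each feature on its own, so it can miss variables that are informative only jointly with others. That is why the detectability assumption mentioned above matters for any guarantee.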
Jun-9-2014