Clustering and Feature Selection using Sparse Principal Component Analysis
Luss, Ronny, d'Aspremont, Alexandre
arXiv.org Artificial Intelligence
This paper applies sparse principal component analysis to clustering and feature selection problems, with a particular focus on gene expression data analysis. Sparse methods have had a significant impact in many areas of statistics, in particular regression and classification (see [CT05], [DT05] and [Vap95] among others). As in these areas, our motivation for developing sparse multivariate visualization tools is the potential of these methods for yielding statistical results that are both more interpretable and more robust than classical analyses, while giving up little statistical efficiency. Principal component analysis (PCA) is a classic tool for analyzing large-scale multivariate data. It seeks linear combinations of the data variables (often called factors or principal components) that capture a maximum amount of variance.
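To make the last two sentences of the abstract concrete, the following is a minimal sketch in Python with NumPy. It computes the leading principal component as the top eigenvector of the sample covariance matrix, then produces a sparse factor by keeping only the largest loadings. The thresholding step is a simple illustrative heuristic and is not claimed to be the method developed in the paper. The function names, the synthetic data, and the choice k=3 are all assumptions made for illustration.

```python
import numpy as np


def leading_component(X):
    """Return the unit-norm direction of maximum variance and that variance."""
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)      # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1], eigvals[-1]


def truncate_loadings(v, k):
    """Keep the k largest-magnitude loadings, zero the rest, renormalize.

    Illustrative thresholding heuristic only (an assumption of this sketch).
    """
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out / np.linalg.norm(out)


# Synthetic data (hypothetical): 200 samples, 10 variables,
# where variables 0..2 share one strong latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = rng.normal(scale=0.3, size=(200, 10))
X[:, :3] += 3.0 * latent

v, var_dense = leading_component(X)
v_sparse = truncate_loadings(v, k=3)

# Variance captured by the sparse factor, for comparison with the dense one.
Xc = X - X.mean(axis=0)
var_sparse = np.var(Xc @ v_sparse, ddof=1)

print("nonzero loadings:", np.flatnonzero(v_sparse))
print("variance (dense, sparse):", var_dense, var_sparse)
```

The sparse factor involves only three variables, which is what makes it easier to interpret, and the variance it captures cannot exceed that of the dense component, which is the trade-off the abstract describes as giving up little statistical efficiency.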
Oct-8-2008
- Country:
- North America > United States
- New Jersey > Mercer County > Princeton (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (0.82)
- Industry:
- Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)