Performance Analysis of Spectral Clustering on Compressed, Incomplete and Inaccurate Measurements

Hunter, Blake; Strohmer, Thomas

arXiv.org Machine Learning 

Spectral clustering is a tool for extracting meaningful information from data by grouping similar objects together [1]. The method uses the eigenvectors of an adjacency matrix to embed the data into a space that captures the underlying group structure [2]. High-dimensional signals, magnetic resonance images, and hyperspectral images can be costly to acquire; even simple direct comparisons could be infeasible among such data sets. Our work shows that the meaningful organization extracted from spectral clustering is preserved under the perturbation from making compressed, incomplete and inaccurate measurements. Using bounds on the perturbation of eigenvectors, we establish error bounds on the spectral embedding when matrix completion and compressed sensing measurements are used. Given some error $N\epsilon$ in the entries of an affinity matrix $A \in \mathbb{R}^{N \times N}$, we show that the space spanned by the first $k$ eigenvectors is within $O(N\epsilon)$ of the span of the unperturbed eigenvectors. We prove that the perturbed spectral coordinates are within $O(N\epsilon)$ of a unitary transform of the unperturbed coordinates and can give k-means cluster assignments within $O(N\epsilon)$ of the unperturbed case. This analysis holds true when the error perturbation in the entries of an affinity matrix, $|A(i,j) - \tilde{A}(i,j)| \le \epsilon$, is caused by making compressed

arXiv:1011.0997v1
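The subspace claim in the abstract can be probed numerically with a small sketch. This is an illustration under stated assumptions, not the authors' code: it assumes a Gaussian affinity on synthetic two-cluster data, uses the leading eigenvectors of the raw affinity matrix (the paper may use a normalized variant), and models the measurement error as symmetric uniform noise bounded entrywise by `eps`. The helper names (`gaussian_affinity`, `top_k_eigenvectors`, `subspace_distance`, `symmetric_noise`) are hypothetical.

```python
import numpy as np


def gaussian_affinity(points, sigma=1.0):
    """Dense Gaussian affinity matrix (an assumed choice of kernel)."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def top_k_eigenvectors(affinity, k):
    """Eigenvectors of the k largest eigenvalues of a symmetric matrix."""
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, vecs = np.linalg.eigh(affinity)
    return vecs[:, -k:]


def subspace_distance(U, V):
    """Spectral norm of the difference of the orthogonal projectors.

    For two subspaces of equal dimension with orthonormal bases U and V,
    this equals the sine of the largest principal angle between them.
    """
    return np.linalg.norm(U @ U.T - V @ V.T, 2)


def symmetric_noise(n, eps, rng):
    """Symmetric noise matrix whose entries are bounded by eps in magnitude."""
    noise = rng.uniform(-eps, eps, size=(n, n))
    return np.triu(noise) + np.triu(noise, 1).T


rng = np.random.default_rng(0)
# Two well-separated synthetic clusters in the plane (assumed test data).
points = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])
n = len(points)
k = 2

A = gaussian_affinity(points)
U = top_k_eigenvectors(A, k)

results = []
for eps in (1e-4, 1e-3, 1e-2):
    A_tilde = A + symmetric_noise(n, eps, rng)
    U_tilde = top_k_eigenvectors(A_tilde, k)
    dist = subspace_distance(U, U_tilde)
    results.append((eps, dist))
    print(f"eps={eps:.0e}  N*eps={n * eps:.1e}  subspace distance={dist:.2e}")
```

Comparing projectors rather than the eigenvectors themselves sidesteps the sign and rotation ambiguity of eigenvectors, which is the same reason the abstract states its coordinate bound only up to a unitary transform.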