The Hidden Convexity of Spectral Clustering

Voss, James, Belkin, Mikhail, Rademacher, Luis

arXiv.org Machine Learning 

Partitioning a dataset into classes based on a similarity between data points, known as cluster analysis, is one of the most basic and practically important problems in data analysis and machine learning. It has a vast array of applications from speech recognition to image analysis to bioinformatics and to data compression. There is an extensive literature on the subject, including a number of different methodologies as well as their various practical and theoretical aspects [11]. In recent years spectral clustering--a class of methods based on the eigenvectors of a certain matrix, typically the graph Laplacian constructed from data--has become a widely used method for cluster analysis. This is due to the simplicity of the algorithm, a number of desirable properties it exhibits and its amenability to theoretical analysis. In its simplest form, spectral bi-partitioning is an attractively straightforward algorithm based on thresholding the second bottom eigenvector of the Laplacian matrix of a graph. However, the more practically significant problem of multiway spectral clustering is considerably more complex. While hierarchical methods based on a sequence of binary splits have been used, the most common approaches use k-means or weighted k-means clustering in the spectral space or related iterative procedures [17, 15, 2, 25].

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found