The Hidden Convexity of Spectral Clustering

Voss, James, Belkin, Mikhail, Rademacher, Luis

May-4-2016–arXiv.org Machine Learning

Partitioning a dataset into classes based on a similarity between data points, known as cluster analysis, is one of the most basic and practically important problems in data analysis and machine learning. It has a vast array of applications from speech recognition to image analysis to bioinformatics and to data compression. There is an extensive literature on the subject, including a number of different methodologies as well as their various practical and theoretical aspects [11]. In recent years spectral clustering--a class of methods based on the eigenvectors of a certain matrix, typically the graph Laplacian constructed from data--has become a widely used method for cluster analysis. This is due to the simplicity of the algorithm, a number of desirable properties it exhibits and its amenability to theoretical analysis. In its simplest form, spectral bi-partitioning is an attractively straightforward algorithm based on thresholding the second bottom eigenvector of the Laplacian matrix of a graph. However, the more practically significant problem of multiway spectral clustering is considerably more complex. While hierarchical methods based on a sequence of binary splits have been used, the most common approaches use k-means or weighted k-means clustering in the spectral space or related iterative procedures [17, 15, 2, 25].

artificial intelligence, machine learning, spectral, (16 more...)

arXiv.org Machine Learning

May-4-2016

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.46)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found