Operator norm convergence of spectral clustering on level sets
Pelletier, Bruno, Pudlo, Pierre
The aim of data clustering, or unsupervised classification, is to partition a data set into several homogeneous groups relatively separated one from each other with respect to a certain distance or notion of similarity. There exists an extensive literature on clustering methods, and we refer the reader to Anderberg [1973], Hartigan [1975], McLachlan and Peel [2000], Chapter 10 in Duda et al. [2000], and Chapter 14 in Hastie et al. [2001] for general materials on the subject. In particular, popular clustering algorithms, such as Gaussian mixture models or k-means, have proved useful in a number of applications, yet they suffer from some internal and computational limitations. Indeed, the parametric assumption at the core of mixture models may be too stringent, while the standard k-means algorithm fails at identifying complex shaped, possibly non-convex, clusters. The class of spectral clustering algorithms is presently emerging as a promising alternative, showing improved performance over classical clustering algorithms on several benchmark problems and applications; see e.g., Ng et al. [2002], von Luxburg [2007]. An overview of spectral clustering algorithms may be found in von Luxburg [2007], and connections with kernel methods are exposed in Fillipone et al. [2008]. The spectral clustering algorithm amounts at embedding the data into a feature space by using the eigenvectors of the similarity matrix in such a way that the clusters may be separated using simple rules, e.g. a separation by hyperplanes. The core component of the spectral clustering algorithm is therefore the similarity matrix, or certain normalizations of it, generally called graph Laplacian matrices; see Chung [1997]. Graph Laplacian matrices may be viewed as discrete versions of bounded operators between functional spaces.
Feb-11-2010
- Country:
- North America > United States
- New York (0.04)
- District of Columbia > Washington (0.04)
- New Jersey > Mercer County
- Princeton (0.04)
- Europe
- Austria > Vienna (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Switzerland > Basel-City
- Basel (0.04)
- France > Occitanie
- Hérault > Montpellier (0.04)
- Asia > Middle East
- Jordan (0.04)
- North America > United States
- Genre:
- Research Report (0.50)
- Technology: