Sampling with Minimum Sum of Squared Similarities for Nystrom-Based Large Scale Spectral Clustering
Bouneffouf, Djallel (Canada's Michael Smith Genome Sciences Centre) | Birol, Inanc (Canada's Michael Smith Genome Sciences Centre)
The Nystrom method provides an efficient sampling approach for large-scale clustering problems by generating a low-rank approximation of the similarity matrix. However, existing sampling methods are limited in both accuracy and computing time. This paper proposes an improved Nystrom-based clustering algorithm with a new sampling procedure, Minimum Sum of Squared Similarities (MSSS). Experiments on synthetic and real data sets show that the proposed sampling achieves higher accuracy than existing sampling algorithms when applied to Nystrom-based spectral clustering. Furthermore, we provide a theoretical analysis that yields an upper bound on the Frobenius norm error of MSSS.
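To make the low-rank approximation mentioned in the abstract concrete, here is a minimal sketch of a generic Nystrom approximation and its relative Frobenius norm error. It is an illustration only: the landmarks are drawn uniformly at random as a stand-in, not by the MSSS procedure, whose details are not given here. The RBF kernel, the `gamma` value, the data, and the function names `rbf_kernel` and `nystrom_approximation` are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) similarity between every row of X and every row of Y."""
    # Squared Euclidean distances via the expansion |x|^2 + |y|^2 - 2 x.y
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_approximation(X, landmark_idx, gamma=1.0):
    """Approximate the full n x n kernel matrix from m sampled landmark points."""
    C = rbf_kernel(X, X[landmark_idx], gamma)  # n x m: all points vs. landmarks
    W = C[landmark_idx, :]                     # m x m: landmarks vs. landmarks
    # Nystrom formula: K is approximated by C W^+ C^T (W^+ is the pseudo-inverse).
    return C @ np.linalg.pinv(W) @ C.T

# Hypothetical toy data: 200 points in 2-D, 20 landmarks.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
K = rbf_kernel(X, X)

# Assumption: uniform random sampling, used here only as a placeholder sampler.
idx = rng.choice(len(X), size=20, replace=False)
K_approx = nystrom_approximation(X, idx)

# Relative Frobenius norm error of the approximation.
rel_error = np.linalg.norm(K - K_approx, "fro") / np.linalg.norm(K, "fro")
print(f"relative Frobenius error with {len(idx)} landmarks: {rel_error:.4f}")
```

Only the choice of `idx` would change under a different sampling scheme; the approximation and the error measure stay the same, which is why the quality of the sampling step can be compared directly through this error.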