Size Regularized Cut for Data Clustering
Chen, Yixin, Zhang, Ya, Ji, Xiang
Neural Information Processing Systems
We present a novel spectral clustering method that enables users to incorporate prior knowledge of cluster sizes into the clustering process. The cost function, named size regularized cut (SRcut), is defined as the sum of the inter-cluster similarity and a regularization term measuring the relative size of the two clusters. Finding a partition of the data set that minimizes SRcut is proved to be NP-complete. An approximation algorithm is proposed that solves a relaxed version of the optimization problem as an eigenvalue problem. Evaluations on different data sets demonstrate that the method is not sensitive to outliers and performs better than normalized cut.
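To make the idea of a cut cost with a size term concrete, here is a minimal Python sketch on a toy graph. The abstract does not give the exact regularizer, so the form used here, cut(V1, V2) - alpha * |V1| * |V2|, is an assumption chosen for illustration. The names `toy_similarity`, `size_regularized_cost`, and `best_bipartition` and the parameter `alpha` are hypothetical. The search is exhaustive, which is feasible only for a handful of points, and it is not the eigenvalue relaxation the abstract refers to.

```python
from itertools import combinations

import numpy as np


def toy_similarity():
    """Two tight groups of three points (0-2 and 3-5) plus one outlier (6)."""
    n = 7
    W = np.full((n, n), 0.1)           # weak similarity between the groups
    W[:3, :3] = 1.0                    # strong similarity inside group A
    W[3:6, 3:6] = 1.0                  # strong similarity inside group B
    W[6, :3] = W[:3, 6] = 0.01         # outlier is barely similar to A
    W[6, 3:6] = W[3:6, 6] = 0.03       # and only slightly more similar to B
    np.fill_diagonal(W, 0.0)
    return W


def cut_value(W, in_v1):
    """Inter-cluster similarity: total weight crossing the partition."""
    return W[in_v1][:, ~in_v1].sum()


def size_regularized_cost(W, in_v1, alpha):
    """Cut plus an ASSUMED size term, -alpha * |V1| * |V2| (illustrative only)."""
    n1 = int(in_v1.sum())
    n2 = in_v1.size - n1
    return cut_value(W, in_v1) - alpha * n1 * n2


def best_bipartition(W, alpha):
    """Exhaustively search all two-way partitions; node 0 is always in V1."""
    n = W.shape[0]
    best_cost, best_v1 = np.inf, None
    for k in range(n - 1):             # V1 has 1..n-1 members, so V2 is never empty
        for extra in combinations(range(1, n), k):
            members = (0, *extra)
            in_v1 = np.zeros(n, dtype=bool)
            in_v1[list(members)] = True
            cost = size_regularized_cost(W, in_v1, alpha)
            if cost < best_cost:
                best_cost, best_v1 = cost, frozenset(members)
    return best_v1, best_cost


if __name__ == "__main__":
    W = toy_similarity()
    for alpha in (0.0, 0.2):
        v1, cost = best_bipartition(W, alpha)
        print(f"alpha={alpha}: V1={sorted(v1)}, cost={cost:.2f}")
```

With `alpha = 0.0` the cost reduces to the plain cut, which on this toy graph is minimized by splitting off the single outlier. With `alpha = 0.2` the assumed size term outweighs that saving and the minimizer separates the two groups instead, which illustrates the kind of outlier insensitivity the abstract describes.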
Dec-31-2006
- Genre:
- Research Report (0.47)