Size Regularized Cut for Data Clustering

Yixin Chen, Ya Zhang, Xiang Ji

Neural Information Processing Systems 

We present a novel spectral clustering method that enables users to incorporate prior knowledge of the size of clusters into the clustering process. The cost function, named size regularized cut (SRcut), is defined as the sum of the inter-cluster similarity and a regularization term measuring the relative size of the two clusters. Finding a partition of the data set that minimizes SRcut is proved to be NP-complete. An approximation algorithm is proposed that solves a relaxed version of the optimization problem as an eigenvalue problem. Evaluations over different data sets demonstrate that the method is not sensitive to outliers and performs better than normalized cut.
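The abstract does not state the cost function explicitly, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the cost takes the form cut(A, B) - alpha * |A| * |B| with uniform point weights, where alpha is a user-chosen regularization strength. With labels x in {-1, +1}, this cost equals a constant minus (1/4) x^T (W - alpha * 1 1^T) x, so a relaxed solution can be read off the eigenvector of the largest eigenvalue of W - alpha * 1 1^T. The names srcut_cost and srcut_bipartition, and the threshold at zero, are hypothetical choices made for illustration.

```python
import numpy as np


def srcut_cost(W, labels, alpha):
    """Assumed cost: inter-cluster similarity minus alpha * |A| * |B|."""
    labels = np.asarray(labels, dtype=bool)
    cut = W[np.ix_(labels, ~labels)].sum()  # similarity across the partition
    return cut - alpha * labels.sum() * (~labels).sum()


def srcut_bipartition(W, alpha):
    """Relaxed two-way split from the leading eigenvector of W - alpha * 11^T."""
    n = W.shape[0]
    M = W - alpha * np.ones((n, n))
    _, vecs = np.linalg.eigh(M)  # eigenvalues are returned in ascending order
    v = vecs[:, -1]              # eigenvector of the largest eigenvalue
    return v >= 0                # assumption: threshold the relaxed solution at 0


# Toy similarity matrix: two groups of three points, strong similarity
# inside each group (1.0) and weak similarity between groups (0.05).
W = np.full((6, 6), 0.05)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

alpha = 0.2  # hypothetical regularization strength
labels = srcut_bipartition(W, alpha)
cost = srcut_cost(W, labels, alpha)
```

On this toy matrix the sketch separates the two groups. Setting alpha to 0 removes the size term; the leading eigenvector of a connected nonnegative W then has entries of a single sign, so thresholding at zero yields a trivial partition, which illustrates why some size term is needed in this formulation.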
