Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

Hajek, Bruce, Wu, Yihong, Xu, Jiaming

arXiv.org Machine Learning 

The community detection problem refers to finding the underlying communities within a network using only the knowledge of the network topology [16, 31]. This paper considers the following probabilistic model for generating a network with underlying community structures: Suppose that out of a total of n vertices, rK of them are partitioned into r clusters of size K, and the remaining n rK vertices do not belong to any clusters (called outlier vertices); a random graph G is generated based on the cluster structure, where each pair of vertices is connected independently with probability p if they are in the same cluster or q otherwise. In particular, an outlier vertex is connected to any other vertex with probability q. This random graph ensemble is known as the planted cluster model [10] with parameters n, r, K N and p, q [0, 1] such that n rK. In particular, we call p and q the in-cluster and cross-cluster edge density, respectively. The planted cluster model encompasses several classical planted random graph models including planted clique [5], planted coloring [4], planted dense subgraph [6], planted partition [11], and the stochastic block model [23], which have been widely used for studying the community detection and graph partitioning problem (see, e.g., [26, 12, 30, 9] and the references therein). This paper was accepted to IEEE Transactions on Information Theory on January 3, 2016. The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Hong Kong, June, 2015 [20].

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found