He, Dongxiao
Detect Overlapping Communities via Ranking Node Popularities
Jin, Di (Tianjin University) | Wang, Hongcui (Tianjin University) | Dang, Jianwu (Tianjin University) | He, Dongxiao (Tianjin University) | Zhang, Weixiong (Washington University in St. Louis)
Detection of overlapping communities has drawn much attention lately as they are essential properties of real complex networks. Despite its influence and popularity, the well studied and widely adopted stochastic model has not been made effective for finding overlapping communities. Here we extend the stochastic model method to detection of overlapping communities with the virtue of autonomous determination of the number of communities. Our approach hinges upon the idea of ranking node popularities within communities and using a Bayesian method to shrink communities to optimize an objective function based on the stochastic generative model. We evaluated the novel approach, showing its superior performance over five state-of-the-art methods, on large real networks and synthetic networks with ground-truths of overlapping communities.
A Stochastic Model for Detecting Heterogeneous Link Communities in Complex Networks
He, Dongxiao (Tianjin University) | Liu, Dayou (Jilin University) | Jin, Di (Tianjin University) | Zhang, Weixiong (Washington University in Saint Louis)
Discovery of communities in networks is a fundamental data analysis problem. Most of the existing approaches have focused on discovering communities of nodes, while recent studies have shown great advantages and utilities of the knowledge of communities of links. Stochastic models provides a promising class of techniques for the identification of modular structures, but most stochastic models mainly focus on the detection of node communities rather than link communities. We propose a stochastic model, which not only describes the structure of link communities, but also considers the heterogeneous distribution of community sizes, a property which is often ignored by other models. We then learn the model parameters using a method of maximum likelihood based on an expectation-maximization algorithm. To deal with large complex real networks, we extend the method by a strategy of iterative bipartition. The extended method is not only efficient, but is also able to determine the number of communities for a given network. We test our approach on both synthetic benchmarks and real-world networks including an application to a large biological network, and also compare it with two existing methods. The results demonstrate the superior performance of our approach over the competing methods for detecting link communities.
Modeling with Node Degree Preservation Can Accurately Find Communities
Jin, Di (Tianjin University) | Chen, Zheng (Washington University in St. Louis) | He, Dongxiao (Tianjin University) | Zhang, Weixiong (Washington University in St. Louis and Jianghan University)
An important problem in analyzing complex networks is discovery of modular or community structures embedded in the networks. Although being promising for identifying network communities, the popular stochastic models often do not preserve node degrees, thus reducing their representation power and applicability to real-world networks. Here we address this critical problem. Instead of using a blockmodel, we adopted a random-graph null model to faithfully capture community structures by preserving in the model the expected node degrees. The new model, learned using nonnegative matrix factorization, is more accurate and robust in representing community structures than the existing methods. Our results from extensive experiments on synthetic benchmarks and real-world networks show the superior performance of the new method over the existing methods in detecting both disjoint and overlapping communities.