Ensemble Clustering for Graphs

Poulin, Valérie, Théberge, François

arXiv.org Machine Learning 

Many data-sets are relational in nature, describing interactions between entities, such as friendship networks, communications or geographical co-locations. Most networks that arise in nature exhibit complex structure [1, 2] with subsets of vertices densely interconnected relative to the rest of the network, which we call communities or clusters. Binary relational data-sets are typically represented as graphs G (V, E), where vertices v V represent the entities, and edges e E represent the relations between pairs of entities. For analyzing and exploring complex relational data-sets, graph clustering is commonly used. In this paper, we propose ECG (Ensemble Clustering for Graphs), a graph clustering method based on the concept of co-association consensus clustering. We show that this approach identifies very high quality clusters by replicating the study in [3] and comparing ECG against the best performing algorithms. We also demonstrate that ECG is stable despite the fact of being a randomize algorithm and that it reduces significantly the resolution limit problem, yielding a number of clusters very close to the ground truth partition size. Finally, ECG provides information about the strength of the associations between entities which can be used to determine the presence or absence of communities in the network.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found