Ensemble Clustering for Graphs
Poulin, Valérie, Théberge, François
Many data-sets are relational in nature, describing interactions between entities, such as friendship networks, communications or geographical co-locations. Most networks that arise in nature exhibit complex structure [1, 2] with subsets of vertices densely interconnected relative to the rest of the network, which we call communities or clusters. Binary relational data-sets are typically represented as graphs G (V, E), where vertices v V represent the entities, and edges e E represent the relations between pairs of entities. For analyzing and exploring complex relational data-sets, graph clustering is commonly used. In this paper, we propose ECG (Ensemble Clustering for Graphs), a graph clustering method based on the concept of co-association consensus clustering. We show that this approach identifies very high quality clusters by replicating the study in [3] and comparing ECG against the best performing algorithms. We also demonstrate that ECG is stable despite the fact of being a randomize algorithm and that it reduces significantly the resolution limit problem, yielding a number of clusters very close to the ground truth partition size. Finally, ECG provides information about the strength of the associations between entities which can be used to determine the presence or absence of communities in the network.
Sep-14-2018
- Country:
- North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
- Genre:
- Research Report (0.82)
- Technology: