Perturb and Combine to Identify Influential Spreaders in Real-World Networks
Tixier, Antoine J. -P., Rossi, Maria-Evgenia G., Malliaros, Fragkiskos D., Read, Jesse, Vazirgiannis, Michalis
Recent research has shown that graph degeneracy algorithms, which decompose a network into a hierarchy of nested subgraphs of decreasing size and increasing density, are very effective at detecting the good spreaders in a network. However, it is also known that degeneracy-based decompositions of a graph are unstable to small perturbations of the network structure. In Machine Learning, the performance of unstable classification and regression methods, such as fully-grown decision trees, can be greatly improved by using Perturb and Combine (P&C) strategies such as bagging (bootstrap aggregating). Therefore, we propose a P&C procedure for networks that (1) creates many perturbed versions of a given graph, (2) applies a node scoring function separately to each graph (such as a degeneracy-based one), and (3) combines the results. We conduct real-world experiments on the tasks of identifying influential spreaders in large social networks, and influential words (keywords) in small word co-occurrence networks. We use the k-core, generalized k-core, and PageRank algorithms as our vertex scoring functions. In each case, using the aggregated scores brings significant improvements compared to using the scores computed on the original graphs. Finally, a bias-variance analysis suggests that our P&C procedure works mainly by reducing bias, and that therefore, it should be capable of improving the performance of all vertex scoring functions, not only unstable ones.
Sep-4-2018
- Country:
- Asia > Middle East
- Jordan (0.04)
- North America > United States
- California > Santa Clara County > Palo Alto (0.04)
- Asia > Middle East
- Genre:
- Research Report (0.50)
- Industry:
- Government (0.46)
- Information Technology (0.49)
- Technology: