Landing Probabilities of Random Walks for Seed-Set Expansion in Hypergraphs

Chien, Eli, Li, Pan, Milenkovic, Olgica

arXiv.org Machine Learning 

Landing Probabilities of Random Walks for Seed-Set Expansion in Hypergraphs Eli Chien Pan Li Olgica Milenkovic Department ECE, UIUC Department ECE, UIUC Department ECE, UIUC Abstract We describe the first known mean-field study of landing probabilities for random walks on hypergraphs. In particular, we examine clique-expansion and tensor methods and evaluate their mean-field characteristics over a class of random hypergraph models for the purpose of seed-set community expansion. We describe parameter regimes in which the two methods outperform each other and propose a hybrid expansion method that uses partial clique-expansion to reduce the projection distortion and low-complexity tensor methods applied directly on the partially expanded hypergraphs. 1 1 Introduction Random walks on graphs are Markov random processes in which given a starting vertex, one moves to a randomly selected neighbor and then repeats the procedure starting from the newly selected vertex [1]. Random walks are used in many graph-based learning algorithms such as PageRank [2] and Label Propagating [3], and they have found a variety of applications in local community detection [4, 5], information retrieval [2] and semi-supervised learning [3]. Random walks are also frequently used to characterize the topological structure of graphs via the hitting time of a vertex from a seed, the commute time between two vertices [6] and the mixing time which also characterizes global graph connectivity [7]. Recently, a new measure of vertex connectivity and similarity, termed a landing probability (LP), was introduced in [8]. A1 Eli Chien and Pan Li contribute equally to this work.Preprint version. LP of a vertex is the probability of a random walk ending at the vertex after making a certain number of steps. Different linear combinations of LPs give rise to different forms of PageRanks (PRs), such as the standard PR [2] and the heat-kernel PR [9], both used for various graph clustering tasks. In particular, Kloumann et al. [8] also initiated the analysis of PRs based on LPs for seed-based community detection. Under the assumption of a generative stochastic block model (SBM) [10] with two blocks, the authors of [8] proved that the empirical average of LPs within the seed community concentrates around a deterministic centroid. Similarly, the empirical averages of LPs outside the seed community also concentrate around another deterministic centroid.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found