Goto

Collaborating Authors

 woh


Landing Probabilities of Random Walks for Seed-Set Expansion in Hypergraphs

arXiv.org Machine Learning

Landing Probabilities of Random Walks for Seed-Set Expansion in Hypergraphs Eli Chien Pan Li Olgica Milenkovic Department ECE, UIUC Department ECE, UIUC Department ECE, UIUC Abstract We describe the first known mean-field study of landing probabilities for random walks on hypergraphs. In particular, we examine clique-expansion and tensor methods and evaluate their mean-field characteristics over a class of random hypergraph models for the purpose of seed-set community expansion. We describe parameter regimes in which the two methods outperform each other and propose a hybrid expansion method that uses partial clique-expansion to reduce the projection distortion and low-complexity tensor methods applied directly on the partially expanded hypergraphs. 1 1 Introduction Random walks on graphs are Markov random processes in which given a starting vertex, one moves to a randomly selected neighbor and then repeats the procedure starting from the newly selected vertex [1]. Random walks are used in many graph-based learning algorithms such as PageRank [2] and Label Propagating [3], and they have found a variety of applications in local community detection [4, 5], information retrieval [2] and semi-supervised learning [3]. Random walks are also frequently used to characterize the topological structure of graphs via the hitting time of a vertex from a seed, the commute time between two vertices [6] and the mixing time which also characterizes global graph connectivity [7]. Recently, a new measure of vertex connectivity and similarity, termed a landing probability (LP), was introduced in [8]. A1 Eli Chien and Pan Li contribute equally to this work.Preprint version. LP of a vertex is the probability of a random walk ending at the vertex after making a certain number of steps. Different linear combinations of LPs give rise to different forms of PageRanks (PRs), such as the standard PR [2] and the heat-kernel PR [9], both used for various graph clustering tasks. In particular, Kloumann et al. [8] also initiated the analysis of PRs based on LPs for seed-based community detection. Under the assumption of a generative stochastic block model (SBM) [10] with two blocks, the authors of [8] proved that the empirical average of LPs within the seed community concentrates around a deterministic centroid. Similarly, the empirical averages of LPs outside the seed community also concentrate around another deterministic centroid.