Vertex nomination schemes for membership prediction

Fishkind, D. E., Lyzinski, V., Pao, H., Chen, L., Priebe, C. E.

arXiv.org Machine Learning 

Suppose that a graph is realized from a stochastic block model where one of the blocks is of interest, but many or all of the vertices' block labels are unobserved. The task is to order the vertices with unobserved block labels into a "nomination list" such that, with high probability, vertices from the interesting block are concentrated near the list's beginning. We propose several vertex nomination schemes. Our basic--but principled--setting and development yields a best nomination scheme (which is a Bayes-Optimal analogue), and also a likelihood maximization nomination scheme that is practical to implement when there are a thousand vertices, and which is empirically near-optimal when the number of vertices is small enough to allow comparison to the best nomination scheme. We then illustrate the robustness of the likelihood maximization nomination scheme to the modeling challenges inherent in real data, using examples which include a social network involving human trafficking, the Enron Graph, a worm brain connectome and a political blog network.

In a stochastic block model, the vertices of the graph are partitioned into blocks, and the existence/nonexistence of an edge between any pair of vertices is an independent Bernoulli trial, with the Bernoulli parameter being a function of the block memberships of the pair of vertices. We are concerned here with a graph realized from a stochastic block model such that many or all of the vertices' block labels are hidden (i.e., unobserved).

Received August 2014; revised February 2015. Supported in part by Johns Hopkins University Human Language Technology Center of Excellence (JHU HLT COE) and the XDATA program of the Defense Advanced Research Projects Agency (DARPA) administered through Air Force Research Laboratory contract FA8750-12-2-0303.
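To make the stochastic block model definition above concrete, here is a minimal sketch of sampling a graph from one. It is an illustration only, not code from the paper: the function name `sample_sbm`, its parameters, and the example block sizes and probabilities are all assumptions chosen for this sketch. It treats the graph as undirected and without self-loops, which is also an assumption.

```python
import random


def sample_sbm(block_sizes, edge_probs, seed=0):
    """Sample an undirected, loop-free graph from a stochastic block model.

    block_sizes[k]   -- number of vertices in block k (hypothetical parameter)
    edge_probs[k][l] -- Bernoulli parameter for a pair of vertices with one
                        vertex in block k and the other in block l
                        (assumed symmetric)

    Returns (labels, adj): labels[i] is the block of vertex i, and adj is a
    symmetric 0/1 adjacency matrix stored as a list of lists.
    """
    rng = random.Random(seed)
    # Vertices are partitioned into blocks: block 0 first, then block 1, ...
    labels = [k for k, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each vertex pair is an independent Bernoulli trial whose
            # parameter depends only on the two block memberships.
            p = edge_probs[labels[i]][labels[j]]
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return labels, adj


# Example: block 0 (4 vertices) plays the role of the "interesting" block.
labels, adj = sample_sbm([4, 6], [[0.8, 0.2], [0.2, 0.5]], seed=1)
```

In the vertex nomination setting, `labels` would be hidden for many or all vertices, and a nomination scheme would see only `adj` together with whatever labels are observed.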
