Chang, Kevin Chen-Chuan
From Node Embedding To Community Embedding
Zheng, Vincent W., Cavallari, Sandro, Cai, Hongyun, Chang, Kevin Chen-Chuan, Cambria, Erik
Most existing graph embedding methods focus on nodes: they aim to output a vector representation for each node such that two nodes that are "close" on the graph are also close in the low-dimensional space. Despite the success of embedding individual nodes for graph analytics, we notice that an important concept, embedding communities (i.e., groups of nodes), is missing. Embedding communities is useful not only for supporting various community-level applications but also for helping to preserve community structure in graph embedding. In fact, we see community embedding as providing a higher-order proximity for defining node closeness, whereas most popular graph embedding methods focus on first-order and/or second-order proximities. To learn the community embedding, we hinge upon the insight that community embedding and node embedding reinforce each other. As a result, we propose ComEmbed, the first community embedding method, which jointly optimizes the community embedding and the node embedding. We evaluate ComEmbed on real-world data sets and show that it outperforms the state-of-the-art baselines on both node classification and community prediction.
Active Learning for Graph Embedding
Cai, Hongyun, Zheng, Vincent W., Chang, Kevin Chen-Chuan
Graph embedding provides an efficient solution for graph analysis by converting the graph into a low-dimensional space that preserves its structural information. In contrast to graph-structured data, the i.i.d. node embeddings can be processed efficiently in terms of both time and space. Current semi-supervised graph embedding algorithms assume that the labelled nodes are given, which may not always be true in the real world. Since manually labelling all training data is impractical, it is important to decide which subset of the training data to label so as to maximize performance on the graph analysis task. This motivates our proposed active graph embedding (AGE) framework, in which we design a general active learning query strategy for any semi-supervised graph embedding algorithm. AGE selects the most informative nodes as the labelled training nodes based on graph information (i.e., node centrality) as well as the learnt node embedding (i.e., node classification uncertainty and node embedding representativeness). The different query criteria are combined using time-sensitive parameters, which shift the focus from graph-based criteria to embedding-based criteria as learning progresses. Experiments on three public data sets verify the effectiveness of each component of our query strategy and the benefit of combining them using time-sensitive parameters. Our code is available online at: https://github.com/vwz/AGE.
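The query strategy described in this abstract can be illustrated with a small sketch. It is not the authors' implementation (see the repository linked above for that). The linear weight schedule, the min-max normalization, and the names `min_max`, `time_weights`, and `pick_query_node` are all assumptions made for illustration; the abstract only states that three criteria are combined with time-sensitive parameters that move the focus from graph-based to embedding-based criteria.

```python
from typing import List, Set, Tuple


def min_max(values: List[float]) -> List[float]:
    """Rescale scores to [0, 1] so the three criteria are comparable (assumed step)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def time_weights(step: int, total_steps: int) -> Tuple[float, float, float]:
    """Hypothetical linear schedule, not taken from the paper.

    The centrality weight decays from 1 to 0 as learning progresses; the
    remaining weight is split equally between the two embedding-based criteria.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    w_graph = 1.0 - progress
    w_embed = progress / 2.0
    return w_graph, w_embed, w_embed


def pick_query_node(
    centrality: List[float],
    uncertainty: List[float],
    representativeness: List[float],
    labelled: Set[int],
    step: int,
    total_steps: int,
) -> int:
    """Return the index of the unlabelled node with the highest combined score."""
    w_c, w_u, w_r = time_weights(step, total_steps)
    c = min_max(centrality)
    u = min_max(uncertainty)
    r = min_max(representativeness)

    best_node, best_score = -1, float("-inf")
    for node in range(len(c)):
        if node in labelled:
            continue  # only unlabelled nodes are candidates for a label query
        score = w_c * c[node] + w_u * u[node] + w_r * r[node]
        if score > best_score:
            best_node, best_score = node, score
    if best_node < 0:
        raise ValueError("no unlabelled node left to query")
    return best_node


# Toy scores for three nodes (made-up numbers, for illustration only).
centrality = [0.9, 0.1, 0.5]
uncertainty = [0.1, 0.9, 0.5]
representativeness = [0.2, 0.8, 0.5]

early = pick_query_node(centrality, uncertainty, representativeness, set(), 0, 10)
late = pick_query_node(centrality, uncertainty, representativeness, set(), 10, 10)
```

Under this assumed schedule, the early query favours the most central node and the late query favours the node that scores highest on the embedding-based criteria. Any monotone schedule with the same direction of shift would fit the description in the abstract equally well.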
Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding
Liu, Zemin (Zhejiang University) | Zheng, Vincent W. (Advanced Digital Sciences Center) | Zhao, Zhou (Zhejiang University) | Zhu, Fanwei (Zhejiang University City College) | Chang, Kevin Chen-Chuan (University of Illinois at Urbana-Champaign) | Wu, Minghui (Zhejiang University City College) | Ying, Jing (Zhejiang University)
Many real-world networks have a rich collection of objects. The semantics of these objects allows us to capture different classes of proximities, thus enabling an important task: semantic proximity search. At the core of semantic proximity search, we have to measure proximity on a heterogeneous graph, whose nodes are objects of various types. Most existing methods rely on engineering features of the graph structure between two nodes to measure their proximity. With recent developments in graph embedding, we see a good chance to avoid feature engineering for semantic proximity search, yet there is very little work on using graph embedding for this task. We also observe that graph embedding methods typically focus on embedding nodes, which is an "indirect" approach to learning proximity. Thus, we introduce a new concept of proximity embedding, which directly embeds the network structure between two possibly distant nodes. We also design our proximity embedding to flexibly support both symmetric and asymmetric proximities. Based on the proximity embedding, we can easily estimate the proximity score between two nodes and enable search on the graph. We evaluate our proximity embedding method on three real-world public data sets and show that it outperforms the state-of-the-art baselines.
Cold-Start Heterogeneous-Device Wireless Localization
Zheng, Vincent W. (Advanced Digital Sciences Center) | Cao, Hong (McLaren Applied Technologies APAC) | Gao, Shenghua (ShanghaiTech University) | Adhikari, Aditi (Advanced Digital Sciences Center) | Lin, Miao (Institute for Infocomm Research, A*STAR) | Chang, Kevin Chen-Chuan (University of Illinois at Urbana-Champaign)
In this paper, we study a cold-start heterogeneous-device localization problem. This problem is challenging, because it results in an extreme inductive transfer learning setting, where there is only source domain data but no target domain data. This problem is also underexplored. As there is no target domain data for calibration, we aim to learn a robust feature representation only from the source domain. There is little previous work on such a robust feature learning task; besides, the existing robust feature representation proposals are both heuristic and inexpressive. As our contribution, we for the first time provide a principled and expressive robust feature representation to solve the challenging cold-start heterogeneous-device localization problem. We evaluate our model on two public real-world data sets, and show that it significantly outperforms the best baseline by 23.1%–91.3% across four pairs of heterogeneous devices.