Due to the shallow structure, classic graph neural networks (GNNs) failed in modelling high-order graph structures that deliver critical insights of task relevant relations. The negligence of those insights lead to insufficient distillation of collaborative signals in recommender systems. In this paper, we propose PEAGNN, a unified GNN framework tailored for recommendation tasks, which is capable of exploiting the rich semantics in metapaths. PEAGNN trains multilayer GNNs to perform metapath-aware information aggregation on collaborative subgraphs, $h$-hop subgraphs around the target user-item pairs. After the attentive fusion of aggregated information from different metapaths, a graph-level representation is then extracted for matching score prediction. To leverage the local structure of collaborative subgraphs, we present entity-awareness that regularizes node embedding with the presence of features in a contrastive manner. Moreover, PEAGNN is compatible with the mainstream GNN structures such as GCN, GAT and GraphSage. The empirical analysis on three public datasets demonstrate that our model outperforms or is at least on par with other competitive baselines. Further analysis indicates that trained PEAGNN automatically derives meaningful metapath combinations from the given metapaths.
Classification and regression in which the inputs are graphs of arbitrary size and shape have been paid attention in various fields such as computational chemistry and bioinformatics. Subgraph indicators are often used as the most fundamental features, but the number of possible subgraph patterns are intractably large due to the combinatorial explosion. We propose a novel efficient algorithm to jointly learn relevant subgraph patterns and nonlinear models of their indicators. Previous methods for such joint learning of subgraph features and models are based on search for single best subgraph features with specific pruning and boosting procedures of adding their indicators one by one, which result in linear models of subgraph indicators. In contrast, the proposed approach is based on directly learning regression trees for graph inputs using a newly derived bound of the total sum of squares for data partitions by a given subgraph feature, and thus can learn nonlinear models through standard gradient boosting. An illustrative example we call the Graph-XOR problem to consider nonlinearity, numerical experiments with real datasets, and scalability comparisons to naive approaches using explicit pattern enumeration are also presented.
Subgraph matching is the problem of determining the presence of a given query graph in a large target graph. Despite being an NPcomplete problem, the subgraph matching problem is crucial in domains ranging from network science and database systems to biochemistry and cognitive science. However, existing techniques based on combinatorial matching and integer programming cannot handle matching problems with both large target and query graphs. Here we propose NeuroMatch, an accurate, efficient, and robust neural approach to subgraph matching. Trained to capture geometric constraints corresponding to subgraph relations, NeuroMatch then efficiently performs subgraph matching directly in the embedding space. Experiments demonstrate that NeuroMatch is 100x faster than existing combinatorial approaches and 18% more accurate than existing approximate subgraph matching methods. Given a query graph, the problem of subgraph isomorphism matching is to determine if a query graph is isomorphic to a subgraph of a large target graph. If the graphs include node and edge features, both the topology as well as the features should be matched. Subgraph matching is a crucial problem in many biology, social network and knowledge graph applications (Gentner, 1983; Raymond et al., 2002; Yang & Sze, 2007; Dai et al., 2019). For example, in social networks and biomedical network science, researchers investigate important subgraphs by counting them in a given network (Alon et al., 2008). In knowledge graphs, common substructures are extracted by querying them in the larger target graph (Gentner, 1983; Plotnick, 1997).
Wu, Jia (University of Technology, Sydney) | Pan, Shirui (University of Technology, Sydney) | Zhu, Xingquan (Florida Atlantic University) | Cai, Zhihua (China University of Geosciences, Wuhan) | Zhang, Chengqi (University of Technology, Sydney)
In this paper, we propose to represent and classify complicated objects. In order to represent the objects, we propose a multi-graph-view model which uses graphs constructed from multiple graph-views to represent an object. In addition, a bag based multi-graph model is further used to relax labeling by only requiring one label for a bag of graphs, which represent one object. In order to learn classification models, we propose a multi-graph-view bag learning algorithm (MGVBL), which aims to explore subgraph features from multiple graph-views for learning. By enabling a joint regularization across multiple graph-views, and enforcing labeling constraints at the bag and graph levels, MGVBL is able to discover most effective subgraph features across all graph-views for learning. Experiments on real-world learning tasks demonstrate the performance of MGVBL for complicated object classification.
Learning powerful representations is one central theme of graph neural networks (GNNs). It requires refining the critical information from the input graph, instead of the trivial patterns, to enrich the representations. Towards this end, graph attention and pooling methods prevail. They mostly follow the paradigm of "learning to attend". It maximizes the mutual information between the attended subgraph and the ground-truth label. However, this training paradigm is prone to capture the spurious correlations between the trivial subgraph and the label. Such spurious correlations are beneficial to in-distribution (ID) test evaluations, but cause poor generalization in the out-of-distribution (OOD) test data. In this work, we revisit the GNN modeling from the causal perspective. On the top of our causal assumption, the trivial information serves as a confounder between the critical information and the label, which opens a backdoor path between them and makes them spuriously correlated. Hence, we present a new paradigm of deconfounded training (DTP) that better mitigates the confounding effect and latches on the critical information, to enhance the representation and generalization ability. Specifically, we adopt the attention modules to disentangle the critical subgraph and trivial subgraph. Then we make each critical subgraph fairly interact with diverse trivial subgraphs to achieve a stable prediction. It allows GNNs to capture a more reliable subgraph whose relation with the label is robust across different distributions. We conduct extensive experiments on synthetic and real-world datasets to demonstrate the effectiveness.