graph rationale
Improving Graph Out-of-distribution Generalization on Real-world Data
Xu, Can, Cheng, Yao, Yu, Jianxiang, Wang, Haosen, Lv, Jingsong, Li, Xiang
Existing methods for graph out-of-distribution (OOD) generalization primarily rely on empirical studies on synthetic datasets. Such approaches tend to overemphasize the causal relationships between invariant sub-graphs and labels, thereby neglecting the non-negligible role of environment in real-world scenarios. In contrast to previous studies that impose rigid independence assumptions on environments and invariant sub-graphs, this paper presents the theorems of environment-label dependency and mutable rationale invariance, where the former characterizes the usefulness of environments in determining graph labels while the latter refers to the mutable importance of graph rationales. Based on analytic investigations, a novel variational inference based method named ``Probability Dependency on Environments and Rationales for OOD Graphs on Real-world Data'' (DEROG) is introduced. To alleviate the adverse effect of unknown prior knowledge on environments and rationales, DEROG utilizes generalized Bayesian inference. Further, DEROG employs an EM-based algorithm for optimization. Finally, extensive experiments on real-world datasets under different distribution shifts are conducted to show the superiority of DEROG. Our code is publicly available at https://anonymous.4open.science/r/DEROG-536B.
Invariant Graph Transformer
Xu, Zhe, Pan, Menghai, Chen, Yuzhong, Chen, Huiyuan, Yan, Yuchen, Das, Mahashweta, Tong, Hanghang
Rationale discovery is defined as finding a subset of the input data that maximally supports the prediction of downstream tasks. In graph machine learning context, graph rationale is defined to locate the critical subgraph in the given graph topology, which fundamentally determines the prediction results. In contrast to the rationale subgraph, the remaining subgraph is named the environment subgraph. Graph rationalization can enhance the model performance as the mapping between the graph rationale and prediction label is viewed as invariant, by assumption. To ensure the discriminative power of the extracted rationale subgraphs, a key technique named "intervention" is applied. The core idea of intervention is that given any changing environment subgraphs, the semantics from the rationale subgraph is invariant, which guarantees the correct prediction result. However, most, if not all, of the existing rationalization works on graph data develop their intervention strategies on the graph level, which is coarse-grained. In this paper, we propose well-tailored intervention strategies on graph data. Our idea is driven by the development of Transformer models, whose self-attention module provides rich interactions between input nodes. Based on the self-attention module, our proposed invariant graph Transformer (IGT) can achieve fine-grained, more specifically, node-level and virtual node-level intervention. Our comprehensive experiments involve 7 real-world datasets, and the proposed IGT shows significant performance advantages compared to 13 baseline methods.