Mitigating Graph Covariate Shift via Score-based Out-of-distribution Augmentation
Bohan Wang, Yurui Chang, Lu Lin
arXiv.org Artificial Intelligence
Distribution shifts between training and testing datasets significantly impair model performance on graph learning. A commonly taken causal view in graph invariant learning suggests that stable predictive features of graphs are causally associated with labels, whereas varying environmental features lead to distribution shifts. In particular, covariate shifts caused by unseen environments in test graphs underscore the critical need for out-of-distribution (OOD) generalization. Existing graph augmentation methods designed to address covariate shift often disentangle the stable and environmental features in the input space, and selectively perturb or mix up the environmental features. However, such perturbation-based methods heavily rely on an accurate separation of stable and environmental features, and their exploration ability is confined to the environmental features already present in the training distribution. To overcome these limitations, we introduce a novel approach using score-based graph generation strategies that synthesize unseen environmental features while preserving the validity and stable features of overall graph patterns. Our comprehensive empirical evaluations demonstrate the enhanced effectiveness of our method in improving graph OOD generalization.

Deep learning algorithms have become predominant in the analysis of graph-structured data. However, a common limitation of existing methods is the assumption that both training and testing graphs are independently and identically distributed (i.i.d.). This assumption often falls short in real-world scenarios, where shifts in data distribution frequently occur, leading to significant degradation in model performance.
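To give a feel for the "score-based generation" ingredient the abstract mentions: score-based generative models sample by following the gradient of the log-density (the score), typically via Langevin dynamics. The sketch below is purely illustrative and is not the paper's model — the paper learns a score network over graphs, whereas here we use the closed-form score of a Gaussian so the example stays self-contained. The function names and parameters are hypothetical.

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    """Score (gradient of log-density) of N(mu, sigma^2 I): -(x - mu)/sigma^2."""
    return -(x - mu) / sigma**2

def langevin_sample(score_fn, x0, step=0.01, n_steps=2000, rng=None):
    """Unadjusted Langevin dynamics: x <- x + step*score(x) + sqrt(2*step)*noise."""
    rng = rng or np.random.default_rng(0)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * score_fn(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Run 200 independent chains from the origin; they should settle into N(mu, sigma^2 I).
mu, sigma = np.array([2.0, -1.0]), 0.5
samples = np.stack([
    langevin_sample(lambda x: gaussian_score(x, mu, sigma),
                    x0=np.zeros(2), rng=np.random.default_rng(s))
    for s in range(200)
])
print(np.round(samples.mean(axis=0), 1))  # sample mean close to mu = [2.0, -1.0]
```

In the graph setting, the closed-form score would be replaced by a learned score network, and the sampling would be steered to perturb only environmental features while keeping label-relevant stable features fixed.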
Oct-22-2024