Zero-shot Generalist Graph Anomaly Detection with Unified Neighborhood Prompts

Niu, Chaoxi, Qiao, Hezhe, Chen, Changlu, Chen, Ling, Pang, Guansong

Oct-18-2024–arXiv.org Artificial Intelligence

Graph anomaly detection (GAD), which aims to identify nodes in a graph that significantly deviate from normal patterns, plays a crucial role in broad application domains. Existing GAD methods, whether supervised or unsupervised, are one-model-for-one-dataset approaches, i.e., they train a separate model for each graph dataset. This limits their applicability in real-world scenarios where training on the target graph data is not possible due to issues like data privacy. To overcome this limitation, we propose a novel zero-shot generalist GAD approach, UNPrompt, that trains a one-for-all detection model: a single GAD model is trained on one graph dataset and then generalizes to detect anomalies in other graph datasets without any retraining or fine-tuning. The key insight in UNPrompt is that i) the predictability of latent node attributes can serve as a generalized anomaly measure and ii) highly generalized normal and abnormal graph patterns can be learned via latent node attribute prediction in a properly normalized node attribute space. UNPrompt achieves generalist GAD through two main modules: one module aligns the dimensionality and semantics of node attributes across different graphs via coordinate-wise normalization in a projected space, while another module learns generalized neighborhood prompts that support the use of latent node attribute predictability as an anomaly score across different datasets. Extensive experiments on real-world GAD datasets show that UNPrompt significantly outperforms diverse competing methods under the generalist GAD setting, and that it also performs strongly under the one-model-for-one-dataset setting.

Graph anomaly detection (GAD) aims to identify anomalous nodes that exhibit significant deviations from the majority of nodes in a graph.
GAD has attracted extensive research attention in recent years (Ma et al., 2021; Pang et al., 2021; Qiao et al., 2024) due to its broad applications in various domains such as spam review detection in online shopping networks (McAuley & Leskovec, 2013; Rayana & Akoglu, 2015) and malicious user detection in social networks (Yang et al., 2019). To handle high-dimensional node attributes and complex structural relations between nodes, graph neural networks (GNNs) (Kipf & Welling, 2016; Wu et al., 2020) have been widely exploited for GAD due to their strong ability to integrate node attributes and graph structure. These methods can be roughly divided into two categories, i.e., supervised and unsupervised methods. One category formulates GAD as a binary classification problem and aims to capture anomaly patterns under the guidance of labels (Tang et al., 2022; Peng et al., 2018; Gao et al., 2023; Wang et al., 2023b).
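
The two modules named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names `unify_attributes` and `neighborhood_scores` are hypothetical, truncated SVD is only an assumed stand-in for the projection step, and the learned neighborhood prompts and the trained GNN are omitted entirely. The sketch only shows the general idea of (1) mapping attributes of any dimensionality into a shared, coordinate-wise normalized space and (2) scoring a node by how poorly its latent attributes are predicted from its neighbors.

```python
import numpy as np


def unify_attributes(X, dim):
    """Project node attributes to `dim` coordinates, then z-score each coordinate.

    Assumption: truncated SVD is used as the projection; the source only
    says "a projected space" and does not specify the method here.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :dim] * S[:dim]
    # Zero-pad when the graph has fewer usable components than `dim`,
    # so every graph ends up with the same attribute dimensionality.
    if Z.shape[1] < dim:
        Z = np.pad(Z, ((0, 0), (0, dim - Z.shape[1])))
    mu = Z.mean(axis=0, keepdims=True)
    sd = Z.std(axis=0, keepdims=True)
    return (Z - mu) / (sd + 1e-8)


def neighborhood_scores(Z, A):
    """Score each node by 1 - cosine(node attributes, mean of neighbor attributes).

    Assumption: a plain neighbor mean replaces the learned neighborhood
    prompts and GNN aggregation described in the abstract.
    A is a dense 0/1 adjacency matrix without self-loops.
    """
    deg = A.sum(axis=1, keepdims=True)
    neigh = (A @ Z) / np.maximum(deg, 1.0)
    num = (Z * neigh).sum(axis=1)
    den = np.linalg.norm(Z, axis=1) * np.linalg.norm(neigh, axis=1) + 1e-8
    return 1.0 - num / den


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))          # toy attributes: 6 nodes, 4 features
    A = np.ones((6, 6)) - np.eye(6)      # toy fully connected graph
    Z = unify_attributes(X, dim=2)
    print(neighborhood_scores(Z, A))     # higher score = less predictable node
```

Under these assumptions, a higher score means the node's latent attributes are harder to predict from its neighborhood, which is the quantity the abstract proposes as a generalized anomaly measure.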

data mining, machine learning, node

Country:
- Asia (0.28)

Genre:
- Research Report (1.00)

Industry:
- Information Technology
  - Security & Privacy (0.68)
  - Services (0.87)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science > Data Mining
    - Anomaly Detection (1.00)