Poursafaei, Farimah
UTG: Towards a Unified View of Snapshot and Event Based Models for Temporal Graphs
Huang, Shenyang, Poursafaei, Farimah, Rabbany, Reihaneh, Rabusseau, Guillaume, Rossi, Emanuele
Temporal graphs have gained increasing importance due to their ability to model dynamically evolving relationships. These graphs can be represented through either a stream of edge events or a sequence of graph snapshots. Until now, the development of machine learning methods for both types has occurred largely in isolation, resulting in limited experimental comparison and theoretical crosspollination between the two. In this paper, we introduce Unified Temporal Graph (UTG), a framework that unifies snapshot-based and event-based machine learning models under a single umbrella, enabling models developed for one representation to be applied effectively to datasets of the other. We also propose a novel UTG training procedure to boost the performance of snapshot-based models in the streaming setting. We comprehensively evaluate both snapshot and event-based models across both types of temporal graphs on the temporal link prediction task. Our main findings are threefold: first, when combined with UTG training, snapshotbased models can perform competitively with event-based models such as TGN and GraphMixer even on event datasets. Second, snapshot-based models are at least an order of magnitude faster than most event-based models during inference. Third, while event-based methods such as NAT and DyGFormer outperforms snapshotbased methods on both types of temporal graphs, this is because they leverage joint neighborhood structural features thus emphasizing the potential to incorporate these features into snapshot-based models as well. These findings highlight the importance of comparing model architectures independent of the data format and suggest the potential of combining the efficiency of snapshot-based models with the performance of event-based models in the future.
TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs
Gastinger, Julia, Huang, Shenyang, Galkin, Mikhail, Loghmani, Erfan, Parviz, Ali, Poursafaei, Farimah, Danovitch, Jacob, Rossi, Emanuele, Koutis, Ioannis, Stuckenschmidt, Heiner, Rabbany, Reihaneh, Rabusseau, Guillaume
Multi-relational temporal graphs are powerful tools for modeling real-world data, capturing the evolving and interconnected nature of entities over time. Recently, many novel models are proposed for ML on such graphs intensifying the need for robust evaluation and standardized benchmark datasets. However, the availability of such resources remains scarce and evaluation faces added complexity due to reproducibility issues in experimental protocols. To address these challenges, we introduce Temporal Graph Benchmark 2.0 (TGB 2.0), a novel benchmarking framework tailored for evaluating methods for predicting future links on Temporal Knowledge Graphs and Temporal Heterogeneous Graphs with a focus on large-scale datasets, extending the Temporal Graph Benchmark. TGB 2.0 facilitates comprehensive evaluations by presenting eight novel datasets spanning five domains with up to 53 million edges. TGB 2.0 datasets are significantly larger than existing datasets in terms of number of nodes, edges, or timestamps. In addition, TGB 2.0 provides a reproducible and realistic evaluation pipeline for multi-relational temporal graphs. Through extensive experimentation, we observe that 1) leveraging edge-type information is crucial to obtain high performance, 2) simple heuristic baselines are often competitive with more complex methods, 3) most methods fail to run on our largest datasets, highlighting the need for research on more scalable methods.
Temporal Graph Rewiring with Expander Graphs
Petroviฤ, Katarina, Huang, Shenyang, Poursafaei, Farimah, Veliฤkoviฤ, Petar
Evolving relations in real-world networks are often modelled by temporal graphs. Graph rewiring techniques have been utilised on Graph Neural Networks (GNNs) to improve expressiveness and increase model performance. In this work, we propose Temporal Graph Rewiring (TGR), the first approach for graph rewiring on temporal graphs. TGR enables communication between temporally distant nodes in a continuous time dynamic graph by utilising expander graph propagation to construct a message passing highway for message passing between distant nodes. Expander graphs are suitable candidates for rewiring as they help overcome the oversquashing problem often observed in GNNs. On the public tgbl-wiki benchmark, we show that TGR improves the performance of a widely used TGN model by a significant margin. Our code repository is accessible at https://github.com/kpetrovicc/TGR.git .
On the Scalability of GNNs for Molecular Graphs
Sypetkowski, Maciej, Wenkel, Frederik, Poursafaei, Farimah, Dickson, Nia, Suri, Karush, Fradkin, Philip, Beaini, Dominique
Scaling deep learning models has been at the heart of recent revolutions in language modelling and image generation. Practitioners have observed a strong relationship between model size, dataset size, and performance. However, structure-based architectures such as Graph Neural Networks (GNNs) are yet to show the benefits of scale mainly due to the lower efficiency of sparse operations, large data requirements, and lack of clarity about the effectiveness of various architectures. We address this drawback of GNNs by studying their scaling behavior. Specifically, we analyze message-passing networks, graph Transformers, and hybrid architectures on the largest public collection of 2D molecular graphs. For the first time, we observe that GNNs benefit tremendously from the increasing scale of depth, width, number of molecules, number of labels, and the diversity in the pretraining datasets. We further demonstrate strong finetuning scaling behavior on 38 highly competitive downstream tasks, outclassing previous large models. This gives rise to MolGPS, a new graph foundation model that allows to navigate the chemical space, outperforming the previous state-of-the-arts on 26 out the 38 downstream tasks. We hope that our work paves the way for an era where foundational GNNs drive pharmaceutical drug discovery.
Temporal Graph Analysis with TGX
Shirzadkhani, Razieh, Huang, Shenyang, Kooshafar, Elahe, Rabbany, Reihaneh, Poursafaei, Farimah
Real-world networks, with their evolving relations, are best captured as temporal graphs. However, existing software libraries are largely designed for static graphs where the dynamic nature of temporal graphs is ignored. Bridging this gap, we introduce TGX, a Python package specially designed for analysis of temporal networks that encompasses an automated pipeline for data loading, data processing, and analysis of evolving graphs. TGX provides access to eleven built-in datasets and eight external Temporal Graph Benchmark (TGB) datasets as well as any novel datasets in the .csv format. Beyond data loading, TGX facilitates data processing functionalities such as discretization of temporal graphs and node subsampling to accelerate working with larger datasets. For comprehensive investigation, TGX offers network analysis by providing a diverse set of measures, including average node degree and the evolving number of nodes and edges per timestamp. Additionally, the package consolidates meaningful visualization plots indicating the evolution of temporal patterns, such as Temporal Edge Appearance (TEA) and Temporal Edge Trafficc (TET) plots. The TGX package is a robust tool for examining the features of temporal graphs and can be used in various areas like studying social networks, citation networks, and tracking user interactions. We plan to continuously support and update TGX based on community feedback. TGX is publicly available on: https://github.com/ComplexData-MILA/TGX.
Temporal Graph Benchmark for Machine Learning on Temporal Graphs
Huang, Shenyang, Poursafaei, Farimah, Danovitch, Jacob, Fey, Matthias, Hu, Weihua, Rossi, Emanuele, Leskovec, Jure, Bronstein, Michael, Rabusseau, Guillaume, Rabbany, Reihaneh
We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are of large scale, spanning years in duration, incorporate both node and edge-level prediction tasks and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, we design evaluation protocols based on realistic use-cases. We extensively benchmark each dataset and find that the performance of common models can vary drastically across datasets. In addition, on dynamic node property prediction tasks, we show that simple methods often achieve superior performance compared to existing temporal graph models. We believe that these findings open up opportunities for future research on temporal graphs. Finally, TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research, including data loading, experiment setup and performance evaluation. TGB will be maintained and updated on a regular basis and welcomes community feedback. TGB datasets, data loaders, example codes, evaluation setup, and leaderboards are publicly available at https://tgb.complexdatalab.com/.
Towards Temporal Edge Regression: A Case Study on Agriculture Trade Between Nations
Jiang, Lekang, Zhang, Caiqi, Poursafaei, Farimah, Huang, Shenyang
Recently, Graph Neural Networks (GNNs) have shown promising performance in tasks on dynamic graphs such as node classification, link prediction and graph regression. However, few work has studied the temporal edge regression task which has important real-world applications. In this paper, we explore the application of GNNs to edge regression tasks in both static and dynamic settings, focusing on predicting food and agriculture trade values between nations. We introduce three simple yet strong baselines and comprehensively evaluate one static and three dynamic GNN models using the UN Trade dataset. Our experimental results reveal that the baselines exhibit remarkably strong performance across various settings, highlighting the inadequacy of existing GNNs. We also find that TGN outperforms other GNN models, suggesting TGN is a more appropriate choice for edge regression tasks. Moreover, we note that the proportion of negative edges in the training samples significantly affects the test performance. The companion source code can be found at: https://github.com/scylj1/GNN_
Towards Improved Illicit Node Detection with Positive-Unlabelled Learning
Luo, Junliang, Poursafaei, Farimah, Liu, Xue
We demonstrate the difference The nature of anonymity and decentralization of blockchain between the estimated values of evaluation metrics and the systems are making changes in the finance industry due actual values through an engineered PU dataset from the to its immutability, transparency, and automation [1]. Such Ethereum transaction dataset proposed in [15] to show the decentralized systems, however, are in the current stage of a concerns of assuming unlabeled data to be normal. We conduct temporarily unregulated environment [2], [3] with a variety of experiments to show that applying various PU classifiers can abnormal usages and security concerns. The abnormal usages help in improving the classification performance on two realworld include both the illicit activities clearly defined by traditional datasets with limited positive labels. The PU classifiers fiance: phishing scams, Ponzi schemes, money laundering, estimate potential identifiable class prior or treat the unlabeled etc. [4], and also the activities with no clear definition of examples as negative samples with label noise and learn with lawfulness or being just defined such as mixing services, i.e., biased models. We also compare various graph representation the mixer nodes involve in funds to confuse the trace of the methods for extracting node embedding vectors as the input transfers from the original source, e.g., US Department of the to get diverse data distribution for the same data to obtain Treasury declared Tornado Cash as a sanctioned entity [5], [6].