Scalable Message Passing Neural Networks: No Need for Attention in Large Graph Representation Learning
Haitz Sáez de Ocáriz Borde, Artem Lukoianov, Anastasis Kratsios, Michael Bronstein, Xiaowen Dong
–arXiv.org Artificial Intelligence
Traditionally, Graph Neural Networks (GNNs) [1] have primarily been applied to model functions over graphs with a relatively modest number of nodes. However, recently there has been growing interest in applying GNNs to large-scale graph benchmarks, including datasets with up to a hundred million nodes [2]. This exploration could potentially lead to better models for industrial applications such as large-scale network analysis in social media, where there are typically millions of users, or in biology, where proteins and other macromolecules are composed of a large number of atoms. This presents a significant challenge: designing GNNs that are scalable while retaining their effectiveness. To this end, we take inspiration from the literature on Large Language Models (LLMs) and propose a simple modification to how GNN architectures are typically arranged. Our framework, Scalable Message Passing Neural Networks (SMPNNs), enables the construction of deep and scalable architectures that outperform the current state-of-the-art models for large graph benchmarks in transductive classification. More specifically, we find that following the typical construction of the Pre-Layer Normalization (Pre-LN) Transformer formulation [3] and replacing attention with standard message-passing convolution is enough to outperform the best Graph Transformers in the literature. Moreover, since our formulation does not necessarily require attention, our architecture scales better than Graph Transformers.
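To make the architectural idea concrete, the sketch below shows one block arranged in the Pre-LN Transformer pattern, with the attention sub-layer swapped for a standard message-passing convolution. This is a minimal illustration, not the authors' implementation: it assumes PyTorch Geometric's GCNConv as the convolution, and all class, parameter, and variable names are illustrative.

```python
# Minimal sketch of one SMPNN-style block (illustrative, not the paper's code).
# Pre-LN layout: LayerNorm -> graph convolution -> residual add,
# then LayerNorm -> feed-forward MLP -> residual add.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv


class PreLNMessagePassingBlock(nn.Module):
    def __init__(self, dim: int, ffn_mult: int = 2):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.conv = GCNConv(dim, dim)      # message passing in place of self-attention
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(          # position-wise feed-forward sub-block
            nn.Linear(dim, ffn_mult * dim),
            nn.GELU(),
            nn.Linear(ffn_mult * dim, dim),
        )

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # Pre-LN residual sub-block 1: normalize, aggregate over neighbors, add back
        x = x + self.conv(self.norm1(x), edge_index)
        # Pre-LN residual sub-block 2: normalize, apply MLP, add back
        x = x + self.ffn(self.norm2(x))
        return x


if __name__ == "__main__":
    # Toy graph: 4 nodes with 16-dimensional features and a directed cycle of edges.
    x = torch.randn(4, 16)
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
    block = PreLNMessagePassingBlock(dim=16)
    print(block(x, edge_index).shape)  # torch.Size([4, 16])
```

Stacking such blocks yields the kind of deep residual architecture the abstract describes; because the aggregation is sparse message passing rather than dense attention, the per-layer cost grows with the number of edges instead of quadratically with the number of nodes, which is the basis of the claimed scalability advantage over Graph Transformers.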
Oct-29-2024