Morris, Christopher
Survey on Generalization Theory for Graph Neural Networks
Vasileiou, Antonis, Jegelka, Stefanie, Levie, Ron, Morris, Christopher
Message-passing graph neural networks (MPNNs) have emerged as the leading approach for machine learning on graphs, attracting significant attention in recent years. While a large body of work has explored the expressivity of MPNNs, i.e., their ability to separate graphs and approximate functions over them, comparatively less attention has been directed toward their generalization abilities, i.e., making meaningful predictions beyond the training data. Here, we systematically review the existing literature on the generalization abilities of MPNNs. We analyze the strengths and limitations of the various studies in this area, providing insights into their methodologies and findings. Furthermore, we identify potential avenues for future research, aiming to deepen our understanding of the generalization abilities of MPNNs.
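To make the message-passing scheme underlying MPNNs concrete, here is a minimal sketch of a single layer with sum aggregation in plain NumPy; the function name, the ReLU update, and the weight shapes are illustrative choices, not taken from any of the surveyed architectures.

```python
import numpy as np

# Illustrative sketch of one MPNN layer; not any specific published architecture.
def mpnn_layer(node_feats, adj, w_self, w_neigh):
    """One message-passing step: each node sums its neighbors' features
    and combines the aggregate with its own state, followed by a ReLU."""
    messages = adj @ node_feats                      # sum over neighbors
    updated = node_feats @ w_self + messages @ w_neigh
    return np.maximum(updated, 0.0)

# Toy example: a triangle graph with 2-dimensional node features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]], dtype=float)
x = rng.normal(size=(3, 2))
h = mpnn_layer(x, adj, rng.normal(size=(2, 2)), rng.normal(size=(2, 2)))
print(h.shape)  # (3, 2)
```

Stacking $L$ such layers lets each node see its $L$-hop neighborhood, the receptive-field picture that both expressivity and generalization analyses of MPNNs work with.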
Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks
Bechler-Speicher, Maya, Finkelshtein, Ben, Frasca, Fabrizio, Müller, Luis, Tönshoff, Jan, Siraudin, Antoine, Zaverkin, Viktor, Bronstein, Michael M., Niepert, Mathias, Perozzi, Bryan, Galkin, Mikhail, Morris, Christopher
While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research and unlock the field's full potential.
Towards graph neural networks for provably solving convex optimization problems
Qian, Chendi, Morris, Christopher
Recently, message-passing graph neural networks (MPNNs) have shown potential for solving combinatorial and continuous optimization problems due to their ability to capture variable-constraint interactions. While existing approaches leverage MPNNs to approximate solutions or warm-start traditional solvers, they often lack guarantees for feasibility, particularly in convex optimization settings. Here, we propose an iterative MPNN framework to solve convex optimization problems with provable feasibility guarantees. First, we demonstrate that MPNNs can provably simulate standard interior-point methods for solving quadratic problems with linear constraints, covering relevant problems such as support vector machines (SVMs). Second, to ensure feasibility, we introduce a variant that starts from a feasible point and iteratively restricts the search to the feasible region. Experimental results show that our approach outperforms existing neural baselines in solution quality and feasibility, generalizes well to unseen problem sizes, and, in some cases, achieves faster solution times than state-of-the-art solvers such as Gurobi.
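To illustrate the feasibility-preserving idea in isolation, the step-size rule below starts at a feasible point and shrinks a candidate update direction so the iterate stays strictly inside the region $Ax \leq b$. This is a generic sketch of that idea, not the paper's MPNN architecture or its interior-point simulation; all names are hypothetical.

```python
import numpy as np

# Generic feasibility-preserving line search; not the paper's method.
def feasible_step(x, d, A, b, shrink=0.99):
    """Largest damped step along direction d keeping A @ x <= b,
    assuming the current iterate x is strictly feasible."""
    slack = b - A @ x                # non-negative slack of each constraint
    rate = A @ d                     # how fast each constraint tightens
    alphas = [s / r for s, r in zip(slack, rate) if r > 1e-12]
    alpha = shrink * min(alphas, default=1.0)
    return x + min(alpha, 1.0) * d

# Toy example: stay inside the box 0 <= x_i <= 1, written as A x <= b.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([1.0, 1.0, 0.0, 0.0])
x = np.array([0.5, 0.5])             # feasible start
d = np.array([2.0, -0.3])            # candidate update direction
x_new = feasible_step(x, d, A, b)
assert np.all(A @ x_new <= b + 1e-9)
```

Any iterative solver, learned or classical, that applies such a rule at every step never leaves the feasible region, mirroring the kind of guarantee described above.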
Covered Forest: Fine-grained generalization analysis of graph neural networks
Vasileiou, Antonis, Finkelshtein, Ben, Geerts, Floris, Levie, Ron, Morris, Christopher
The expressive power of message-passing graph neural networks (MPNNs) is reasonably well understood, primarily through combinatorial techniques from graph isomorphism testing. However, MPNNs' generalization abilities -- making meaningful predictions beyond the training set -- remain less explored. Current generalization analyses often overlook graph structure, limit the focus to specific aggregation functions, and assume the impractical, hard-to-optimize $0$-$1$ loss function. Here, we extend recent advances in graph similarity theory to assess the influence of graph structure, aggregation, and loss functions on MPNNs' generalization abilities. Our empirical study supports our theoretical insights, improving our understanding of MPNNs' generalization properties.
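For context on the loss functions mentioned above: for a classifier $f$ on a graph $G$ with label $y \in \{-1, +1\}$, the $0$-$1$ loss and a Lipschitz surrogate such as the hinge loss read

\[
\ell_{0\text{-}1}(f(G), y) = \mathbb{1}\{\operatorname{sign}(f(G)) \neq y\}, \qquad \ell_{\mathrm{hinge}}(f(G), y) = \max\{0,\, 1 - y f(G)\}.
\]

The $0$-$1$ loss is discontinuous and hence incompatible with gradient-based training, which is why bounds stated only for it can be impractical; Lipschitz surrogates like the hinge or cross-entropy loss are what is optimized in practice. This is standard background, not the paper's specific setup.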
Cometh: A continuous-time discrete-state graph diffusion model
Siraudin, Antoine, Malliaros, Fragkiskos D., Morris, Christopher
Discrete-state denoising diffusion models have led to state-of-the-art performance in graph generation, especially in the molecular domain. Recently, they have been transposed to continuous time, allowing more flexibility in the reverse process and a better trade-off between sampling efficiency and quality. Here, to leverage the benefits of both approaches, we propose Cometh, a continuous-time discrete-state graph diffusion model that integrates graph data into a continuous-time diffusion framework. Empirically, we show that integrating continuous time leads to significant improvements across various metrics over state-of-the-art discrete-state diffusion models on a large set of molecular and non-molecular benchmark datasets.
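As background on the continuous-time discrete-state formulation, a generic sketch of a uniform-rate Markov jump process is given below; this is standard machinery, not Cometh's actual noising schedule, and all names are illustrative.

```python
import numpy as np
from scipy.linalg import expm

# Generic continuous-time discrete-state forward process; not Cometh itself.
K = 4                                   # number of discrete states, e.g., atom types
# Uniform rate matrix: jump to any other state at rate 1/(K-1);
# rows sum to zero, as required for a CTMC generator.
Q = np.full((K, K), 1.0 / (K - 1))
np.fill_diagonal(Q, -1.0)

def transition_kernel(t):
    """P(t)[i, j]: probability of being in state j at time t, starting in i."""
    return expm(t * Q)

rng = np.random.default_rng(0)
state = 2                               # clean categorical state
t = 0.7                                 # diffusion time
probs = transition_kernel(t)[state]     # row of the transition kernel
noisy = rng.choice(K, p=probs)          # sample the noised state
```

Because $P(t) = \exp(tQ)$ is defined for every $t \geq 0$, the reverse process can be discretized with arbitrary step sizes, which is the flexibility in sampling that the abstract alludes to.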
Probabilistic Graph Rewiring via Virtual Nodes
Qian, Chendi, Manolache, Andrei, Morris, Christopher, Niepert, Mathias
Message-passing graph neural networks (MPNNs) have emerged as a powerful paradigm for graph-based machine learning. Despite their effectiveness, MPNNs face challenges such as under-reaching and over-squashing, where limited receptive fields and structural bottlenecks hinder information flow in the graph. While graph transformers hold promise in addressing these issues, their scalability is limited due to quadratic complexity in the number of nodes, rendering them impractical for larger graphs. Here, we propose implicitly rewired message-passing neural networks (IPR-MPNNs), a novel approach that integrates implicit probabilistic graph rewiring into MPNNs. By introducing a small number of virtual nodes, i.e., adding additional nodes to a given graph and connecting them to existing nodes, in a differentiable, end-to-end manner, IPR-MPNNs enable long-distance message propagation while circumventing quadratic complexity. Theoretically, we demonstrate that IPR-MPNNs surpass the expressiveness of traditional MPNNs. Empirically, we validate our approach by showcasing its ability to mitigate under-reaching and over-squashing effects, achieving state-of-the-art performance across multiple graph datasets. Notably, IPR-MPNNs outperform graph transformers while being significantly more computationally efficient.
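The sketch below shows only the graph-augmentation step, appending virtual nodes and connecting each original node to one of them; the hard assignment `assign` is fixed here, whereas IPR-MPNNs learn such connections probabilistically and end-to-end. Function and variable names are hypothetical.

```python
import numpy as np

# Static virtual-node augmentation; IPR-MPNNs instead learn the assignment.
def add_virtual_nodes(edge_index, num_nodes, num_virtual, assign):
    """Append `num_virtual` virtual nodes and connect each original node i
    to virtual node assign[i], in both directions."""
    v_ids = num_nodes + np.arange(num_virtual)
    src = np.arange(num_nodes)
    dst = v_ids[assign]
    new_edges = np.concatenate([np.stack([src, dst]),
                                np.stack([dst, src])], axis=1)
    return np.concatenate([edge_index, new_edges], axis=1)

# Toy path graph 0-1-2-3 with two virtual nodes.
edge_index = np.array([[0, 1, 1, 2, 2, 3],
                       [1, 0, 2, 1, 3, 2]])
assign = np.array([0, 0, 1, 1])          # node -> virtual-node assignment
augmented = add_virtual_nodes(edge_index, num_nodes=4, num_virtual=2,
                              assign=assign)
```

After augmentation, any two nodes sharing a virtual node are two hops apart, so messages can travel long graph distances without the quadratic cost of full attention.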
Aligning Transformers with Weisfeiler-Leman
Müller, Luis, Morris, Christopher
Graph neural network architectures aligned with the $k$-dimensional Weisfeiler-Leman ($k$-WL) hierarchy offer theoretically well-understood expressive power. However, these architectures often fail to deliver state-of-the-art predictive performance on real-world graphs, limiting their practical utility. While recent works aligning graph transformer architectures with the $k$-WL hierarchy have shown promising empirical results, employing transformers for higher orders of $k$ remains challenging due to the prohibitive runtime and memory complexity of self-attention as well as impractical architectural assumptions, such as an infeasible number of attention heads. Here, we advance the alignment of transformers with the $k$-WL hierarchy, showing stronger expressivity results for each $k$, making them more feasible in practice. In addition, we develop a theoretical framework that allows the study of established positional encodings such as Laplacian PEs and SPE. We evaluate our transformers on the large-scale PCQM4Mv2 dataset, showing predictive performance competitive with the state of the art and demonstrating strong downstream performance when fine-tuning them on small-scale molecular datasets. Our code is available at https://github.com/luis-mueller/wl-transformers.
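As a reference point for the positional encodings studied in the paper, here is a minimal sketch of one common variant of Laplacian PEs: the eigenvectors of the symmetric normalized Laplacian belonging to the $k$ smallest non-trivial eigenvalues (SPE is more involved and omitted; eigenvector signs are ambiguous and are typically handled by random sign flips during training).

```python
import numpy as np

# One common Laplacian-PE variant; details differ across implementations.
def laplacian_pe(adj, k):
    """Return, per node, the entries of the eigenvectors belonging to the
    k smallest non-trivial eigenvalues of the normalized Laplacian."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = deg ** -0.5                      # assumes no isolated nodes
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)        # ascending eigenvalues
    return eigvecs[:, 1:k + 1]                    # drop the trivial eigenvector

# Toy example: a 4-cycle; each node receives a 2-dimensional encoding.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
pe = laplacian_pe(adj, k=2)
print(pe.shape)  # (4, 2)
```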
Weisfeiler-Leman at the margin: When more expressivity matters
Franks, Billy J., Morris, Christopher, Velingker, Ameya, Geerts, Floris
The Weisfeiler-Leman algorithm ($1$-WL) is a well-studied heuristic for the graph isomorphism problem. Recently, the algorithm has played a prominent role in understanding the expressive power of message-passing graph neural networks (MPNNs) and has proven effective as a graph kernel. Despite its success, $1$-WL faces challenges in distinguishing non-isomorphic graphs, leading to the development of more expressive MPNN and kernel architectures. However, the relationship between enhanced expressivity and improved generalization performance remains unclear. Here, we show that an architecture's expressivity offers limited insights into its generalization performance when viewed through the lens of graph isomorphism. Moreover, we focus on augmenting $1$-WL and MPNNs with subgraph information and employ classical margin theory to investigate the conditions under which an architecture's increased expressivity aligns with improved generalization performance. In addition, we show that gradient flow pushes the MPNN's weights toward the maximum-margin solution. Further, we introduce variations of expressive $1$-WL-based kernel and MPNN architectures with provable generalization properties. Our empirical study confirms the validity of our theoretical findings.
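For reference, below is a compact sketch of $1$-WL color refinement, the heuristic the paper builds on; the histogram of the resulting colors is the feature vector used by the WL subtree kernel. The dictionary-based re-indexing stands in for the hashing step and is an implementation choice.

```python
# Sketch of 1-WL color refinement; the re-indexing replaces a hash function.
def wl_refinement(adj_list, num_rounds):
    """Iteratively refine node colors by hashing each node's color
    together with the multiset of its neighbors' colors."""
    colors = {v: 0 for v in adj_list}             # uniform initial coloring
    for _ in range(num_rounds):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj_list[v])))
            for v in adj_list
        }
        palette = {}                              # fresh integer per signature
        for v, sig in sorted(signatures.items()):
            colors[v] = palette.setdefault(sig, len(palette))
    return colors

# Toy path graph 0-1-2: the endpoints end up with a color distinct
# from the middle node's color.
adj_list = {0: [1], 1: [0, 2], 2: [1]}
print(wl_refinement(adj_list, num_rounds=2))  # {0: 0, 1: 1, 2: 0}
```

Two graphs with different color histograms are certifiably non-isomorphic; the converse fails, which is exactly the expressivity limitation discussed above.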
Towards Principled Graph Transformers
Müller, Luis, Kusuma, Daniel, Morris, Christopher
Graph Neural Networks (GNNs) are the de-facto standard in graph learning [Kipf and Welling, 2017, Gilmer et al., 2017, Scarselli et al., 2009, Xu et al., 2019] but suffer from limited expressivity in distinguishing non-isomorphic graphs, as captured by the $1$-dimensional Weisfeiler-Leman algorithm ($1$-WL) [Morris et al., 2019, Xu et al., 2019]. Hence, recent works introduced higher-order GNNs, aligned with the $k$-dimensional Weisfeiler-Leman ($k$-WL) hierarchy for graph isomorphism testing [Azizian and Lelarge, 2021, Morris et al., 2019, 2020, 2022], whose expressivity increases with $k > 1$. The $k$-WL hierarchy draws from a rich history in graph theory [Babai, 1979, Babai and Kucera, 1979, Babai et al., 1980, Cai et al., 1992, Weisfeiler and Leman, 1968], offering a deep theoretical understanding of $k$-WL-aligned GNNs. While theoretically intriguing, higher-order GNNs often fail to deliver state-of-the-art performance on real-world problems, making theoretically grounded models less relevant in practice [Azizian and Lelarge, 2021, Morris et al., 2020, 2022]. In contrast, graph transformers [Glickman and Yahav, 2023, He et al., 2023, Ma et al., 2023, Rampášek et al., 2022, Ying et al., 2021] recently demonstrated state-of-the-art empirical performance. However, they draw their expressive power mostly from positional/structural encodings (PEs), making it difficult to place these models within an expressivity hierarchy such as the $k$-WL. While a few works theoretically aligned graph transformers with the $k$-WL hierarchy [Kim et al., 2021, 2022, Zhang et al., 2023], we are not aware of any works […]
Future Directions in Foundations of Graph Machine Learning
Morris, Christopher, Dym, Nadav, Maron, Haggai, Ceylan, İsmail İlkan, Frasca, Fabrizio, Levie, Ron, Lim, Derek, Bronstein, Michael, Grohe, Martin, Jegelka, Stefanie
Machine learning on graphs, especially using graph neural networks (GNNs), has seen a surge in interest due to the wide availability of graph data across a broad spectrum of disciplines, from life to social and engineering sciences. Despite their practical success, our theoretical understanding of the properties of GNNs remains highly incomplete. Recent theoretical advancements primarily focus on elucidating the coarse-grained expressive power of GNNs, predominantly employing combinatorial techniques. However, these studies do not perfectly align with practice, particularly in understanding the generalization behavior of GNNs when trained with stochastic first-order optimization techniques. In this position paper, we argue that the graph machine learning community needs to shift its attention to developing a more balanced theory of graph machine learning, focusing on a more thorough understanding of the interplay of expressive power, generalization, and optimization.