Goto

Collaborating Authors

 graph neural tangent kernel


Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Neural Information Processing Systems

While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information of graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees of GNNs are also not well-understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to \emph{infinitely wide} multi-layer GNNs trained by gradient descent. GNTKs enjoy the full expressive power of GNNs and inherit advantages of GKs. Theoretically, we show GNTKs provably learn a class of smooth functions on graphs. Empirically, we test GNTKs on graph classification datasets and show they achieve strong performance.


Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Neural Information Processing Systems

While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information of graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees of GNNs are also not well-understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to \emph{infinitely wide} multi-layer GNNs trained by gradient descent.


Simplifying Graph Kernels for Efficient

Wang, Lin, Wang, Shijie, Huang, Sirui, Li, Qing

arXiv.org Artificial Intelligence

While kernel methods and Graph Neural Networks offer complementary strengths, integrating the two has posed challenges in efficiency and scalability. The Graph Neural Tangent Kernel provides a theoretical bridge by interpreting GNNs through the lens of neural tangent kernels. However, its reliance on deep, stacked layers introduces repeated computations that hinder performance. In this work, we introduce a new perspective by designing the simplified graph kernel, which replaces deep layer stacking with a streamlined $K$-step message aggregation process. This formulation avoids iterative layer-wise propagation altogether, leading to a more concise and computationally efficient framework without sacrificing the expressive power needed for graph tasks. Beyond this simplification, we propose another Simplified Graph Kernel, which draws from Gaussian Process theory to model infinite-width GNNs. Rather than simulating network depth, this kernel analytically computes kernel values based on the statistical behavior of nonlinear activations in the infinite limit. This eliminates the need for explicit architecture simulation, further reducing complexity. Our experiments on standard graph and node classification benchmarks show that our methods achieve competitive accuracy while reducing runtime. This makes them practical alternatives for learning on graphs at scale. Full implementation and reproducibility materials are provided at: https://anonymous.4open.science/r/SGNK-1CE4/.


Reviews: Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Neural Information Processing Systems

Pros. - The derivation of explicit formulae, which I think is correct, is neat and requires work. Since the title is GNTK, it may be better to include analysis of graph convolutional layers. Theorems 4.2 and 4.3 are more like technical lemmas whose usefulnesses are in doubt per se. Furthermore, the argument of polynomial'' sample complexity relies on the condition given in line 193 (please consider numbering the equations). There is no discussion in relation to the applicability of this condition.



Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Neural Information Processing Systems

While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information of graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees of GNNs are also not well-understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to \emph{infinitely wide} multi-layer GNNs trained by gradient descent.


Graph Neural Tangent Kernel: Convergence on Large Graphs

Krishnagopal, Sanjukta, Ruiz, Luana

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) achieve remarkable performance in graph machine learning tasks but can be hard to train on large-graph data, where their learning dynamics are not well understood. We investigate the training dynamics of large-graph GNNs using graph neural tangent kernels (GNTKs) and graphons. In the limit of large width, optimization of an overparametrized NN is equivalent to kernel regression on the NTK. Here, we investigate how the GNTK evolves as another independent dimension is varied: the graph size. We use graphons to define limit objects -- graphon NNs for GNNs, and graphon NTKs for GNTKs -- , and prove that, on a sequence of graphs, the GNTKs converge to the graphon NTK. We further prove that the spectrum of the GNTK, which is related to the directions of fastest learning which becomes relevant during early stopping, converges to the spectrum of the graphon NTK. This implies that in the large-graph limit, the GNTK fitted on a graph of moderate size can be used to solve the same task on the large graph, and to infer the learning dynamics of the large-graph GNN. These results are verified empirically on node regression and classification tasks.


Graph Convolutional Networks from the Perspective of Sheaves and the Neural Tangent Kernel

Gebhart, Thomas

arXiv.org Artificial Intelligence

Graph convolutional networks are a popular class of deep neural network algorithms which have shown success in a number of relational learning tasks. Despite their success, graph convolutional networks exhibit a number of peculiar features, including a bias towards learning oversmoothed and homophilic functions, which are not easily diagnosed due to the complex nature of these algorithms. We propose to bridge this gap in understanding by studying the neural tangent kernel of sheaf convolutional networks--a topological generalization of graph convolutional networks. To this end, we derive a parameterization of the neural tangent kernel for sheaf convolutional networks which separates the function into two parts: one driven by a forward diffusion process determined by the graph, and the other determined by the composite effect of nodes' activations on the output layer. This geometrically-focused derivation produces a number of immediate insights which we discuss in detail.


Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Du, Simon S., Hou, Kangcheng, Póczos, Barnabás, Salakhutdinov, Ruslan, Wang, Ruosong, Xu, Keyulu

arXiv.org Artificial Intelligence

While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information of graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees of GNNs are also not well-understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to \emph{infinitely wide} multi-layer GNNs trained by gradient descent. GNTKs enjoy the full expressive power of GNNs and inherit advantages of GKs. Theoretically, we show GNTKs provably learn a class of smooth functions on graphs. Empirically, we test GNTKs on graph classification datasets and show they achieve strong performance.