Enhancing Transformer with GNN Structural Knowledge via Distillation: A Novel Approach
arXiv.org Artificial Intelligence
Integrating the structural inductive biases of Graph Neural Networks (GNNs) with the global contextual modeling capabilities of Transformers is a pivotal challenge in graph representation learning. While GNNs excel at capturing localized topological patterns through message passing, their limited ability to model long-range dependencies and their poor parallelizability hinder deployment at scale. Conversely, Transformers use self-attention to achieve global receptive fields but struggle to inherit the intrinsic graph structural priors of GNNs. This paper proposes a knowledge distillation framework that systematically transfers multiscale structural knowledge from GNN teacher models to Transformer student models, offering a new perspective on the challenges of cross-architectural distillation. This work establishes a paradigm for inheriting graph structural biases in Transformer architectures, with broad application prospects.
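The abstract does not give the concrete loss used in the proposed framework, but the generic teacher–student distillation objective it builds on can be sketched as follows. This is a minimal illustration, not the paper's method: the function names, the temperature value, and the assumption that GNN teacher and Transformer student emit comparable per-node logits are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature softens the distribution,
    # exposing more of the teacher's "dark knowledge" to the student.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) between temperature-softened distributions,
    # scaled by T^2 (the standard correction so gradients keep the same
    # magnitude as the temperature changes).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# When the student matches the teacher exactly, the loss is zero.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
```

In practice this distillation term would be combined with the student's supervised task loss, and a multiscale variant (as the paper describes) would apply similar matching terms at several structural granularities rather than only at the output logits.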
Feb-27-2025