Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors

May-28-2025, 10:21:07 GMT–Neural Information Processing Systems

The crucial insight is that a normalized signal-dependent graph learning module amounts to a variant of the basic self-attention mechanism in conventional transformers. Unlike "black-box" transformers that require learning of large key, query and value matrices to compute scaled dot products as affinities and subsequent output embeddings, resulting in huge parameter sets, our unrolled networks employ shallow CNNs to learn low-dimensional features per node to establish pairwise Mahalanobis distances and construct sparse similarity graphs. At each layer, given a learned graph, the target interpolated signal is simply a low-pass filtered output derived from the minimization of an assumed graph smoothness prior, leading to a dramatic reduction in parameter count. Experiments for two image interpolation applications verify the restoration performance, parameter efficiency and robustness to covariate shift of our graph-based unrolled networks compared to conventional transformers.
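The pipeline the abstract describes (per-node features, Mahalanobis distances, a similarity graph, then interpolation by minimizing a smoothness prior) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It rests on several assumptions: the features are random stand-ins for the shallow-CNN outputs, the edge weight is a Gaussian kernel of the squared Mahalanobis distance, the graph is left dense rather than sparsified, and the prior is the graph Laplacian quadratic form x^T L x with known samples held fixed. All function names are hypothetical.

```python
import numpy as np

def pairwise_sq_mahalanobis(features, M):
    """Squared Mahalanobis distance (f_i - f_j)^T M (f_i - f_j) for all pairs."""
    diff = features[:, None, :] - features[None, :, :]   # shape (N, N, K)
    return np.einsum("ijk,kl,ijl->ij", diff, M, diff)    # shape (N, N)

def similarity_graph(features, M):
    """Symmetric edge weights w_ij = exp(-d_ij^2); the kernel is an assumption."""
    W = np.exp(-pairwise_sq_mahalanobis(features, M))
    np.fill_diagonal(W, 0.0)                              # no self-loops
    return W

def attention_like(W):
    """Row-normalize the weights, i.e. a softmax over negative squared distances.
    This is the sense in which normalized graph learning resembles self-attention."""
    return W / W.sum(axis=1, keepdims=True)

def interpolate(W, known_idx, known_vals):
    """Minimize x^T L x subject to x[known_idx] = known_vals.
    Closed form for the unknown entries: x_u = -L_uu^{-1} L_uk x_k."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                        # combinatorial Laplacian
    unknown_idx = np.setdiff1d(np.arange(n), known_idx)
    L_uu = L[np.ix_(unknown_idx, unknown_idx)]
    L_uk = L[np.ix_(unknown_idx, known_idx)]
    x = np.empty(n)
    x[known_idx] = known_vals
    x[unknown_idx] = np.linalg.solve(L_uu, -L_uk @ known_vals)
    return x

# Toy example: 6 nodes, 3-dimensional features (random stand-ins).
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 3))
Q = 0.3 * rng.normal(size=(3, 3))
M = Q @ Q.T                                               # PSD metric matrix (assumed)

W = similarity_graph(features, M)
A = attention_like(W)                                     # each row sums to 1
known_idx = np.array([0, 2, 5])
x = interpolate(W, known_idx, np.array([1.0, 1.0, 1.0]))
```

In this sketch the only trainable quantities would be the feature extractor and the metric matrix `M`; the interpolation step itself is a linear solve with no weights of its own. That is one way to read the abstract's claim about reduced parameter count. A quick sanity check is that a constant signal on the known nodes is interpolated to the same constant everywhere, since constants lie in the null space of the Laplacian.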

artificial intelligence, machine learning, natural language

Country:
- North America > Canada > Ontario > Toronto (0.14)

Genre:
- Research Report > Experimental Study (0.93)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning
      - Neural Networks > Deep Learning (0.68)
      - Statistical Learning (0.68)
    - Natural Language (1.00)
    - Representation & Reasoning (1.00)
    - Vision (1.00)
  - Sensing and Signal Processing > Image Processing (0.94)
