Global Filter Networks for Image Classification

May-28-2025, 07:38:18 GMT–Neural Information Processing Systems

Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential for achieving promising performance with fewer inductive biases. These models are generally based on learning interactions among spatial locations from raw data. The complexity of self-attention and MLP grows quadratically as the image size increases, which makes these models hard to scale up when high-resolution features are required. In this paper, we present the Global Filter Network (GFNet), a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity. Our architecture replaces the self-attention layer in vision transformers with three key operations: a 2D discrete Fourier transform, an element-wise multiplication between frequency-domain features and learnable global filters, and a 2D inverse Fourier transform.
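
The three operations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `global_filter`, the (H, W, C) feature layout, the use of the real-input FFT, and the orthonormal normalization are all assumptions made for illustration. In a trained model the filter would be a learned parameter; here it is just an array passed in.

```python
import numpy as np

def global_filter(x, filt):
    """Apply a per-channel global filter in the frequency domain.

    x:    real array of shape (H, W, C), spatial features (assumed layout).
    filt: complex array of shape (H, W // 2 + 1, C), one weight per
          frequency and channel (stands in for the learnable filter).
    """
    h, w, _ = x.shape
    # 1) 2D discrete Fourier transform over the two spatial axes.
    #    rfft2 keeps only the non-redundant half of the spectrum of a real input.
    freq = np.fft.rfft2(x, axes=(0, 1), norm="ortho")
    # 2) Element-wise multiplication with the global filter.
    freq = freq * filt
    # 3) 2D inverse Fourier transform back to the spatial domain.
    return np.fft.irfft2(freq, s=(h, w), axes=(0, 1), norm="ortho")

# Hypothetical usage: an 8x8 feature map with 3 channels and a random filter.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 3))
weights = rng.standard_normal((8, 5, 3)) + 1j * rng.standard_normal((8, 5, 3))
mixed = global_filter(features, weights)
```

Every output location depends on every input location, because multiplication in the frequency domain corresponds to a circular convolution over the whole feature map. The cost is dominated by the two FFTs, which is where the log-linear complexity mentioned above comes from.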

artificial intelligence, data quality, machine learning

Country:
- Asia > China (0.14)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
      - Perceptrons (0.68)
    - Vision (1.00)
  - Data Science > Data Quality
    - Data Transformation (0.91)
  - Sensing and Signal Processing > Image Processing (1.00)
