Scattering Vision Transformer: Spectral Mixing Matters

Jan-19-2025, 18:35:22 GMT–Neural Information Processing Systems

Vision transformers have gained significant attention and achieved state-of-the-art performance in various computer vision tasks, including image classification, instance segmentation, and object detection. However, challenges remain in addressing attention complexity and effectively capturing fine-grained information within images. Existing solutions often resort to down-sampling operations, such as pooling, to reduce computational cost. Unfortunately, such operations are non-invertible and can result in information loss. In this paper, we present a novel approach called Scattering Vision Transformer (SVT) to tackle these challenges. SVT incorporates a spectrally scattering network that enables the capture of intricate image details.
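The claim that pooling is non-invertible can be illustrated with a small sketch. The example below is only an illustration and is not SVT's scattering network: the one-level Haar transform is an assumed stand-in for an invertible spectral decomposition, and the function names `avg_pool_1d`, `haar_forward`, and `haar_inverse` are hypothetical. It shows two different signals collapsing to the same pooled output, while the transform that keeps the detail coefficients can be inverted exactly.

```python
def avg_pool_1d(x, k=2):
    """Average non-overlapping windows of size k (a simple down-sampling op)."""
    return [sum(x[i:i + k]) / k for i in range(0, len(x), k)]


def haar_forward(x):
    """One-level Haar-style transform (assumed stand-in, not SVT's method).

    For each pair of samples, keep both the average (low-frequency part)
    and the half-difference (high-frequency detail).
    """
    avg = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    diff = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return avg, diff


def haar_inverse(avg, diff):
    """Rebuild the original samples from averages and half-differences."""
    out = []
    for a, d in zip(avg, diff):
        out.extend([a + d, a - d])
    return out


# Two different signals with identical pairwise averages.
signal_a = [1.0, 3.0, 5.0, 7.0]
signal_b = [2.0, 2.0, 6.0, 6.0]

pooled_a = avg_pool_1d(signal_a)
pooled_b = avg_pool_1d(signal_b)

# Pooling maps both signals to the same output, so it cannot be inverted.
pooling_collides = signal_a != signal_b and pooled_a == pooled_b

# Keeping the detail coefficients makes the decomposition invertible.
recovered_a = haar_inverse(*haar_forward(signal_a))
```

Here the fine-grained differences within each pair are exactly what pooling discards and what the detail coefficients retain.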


