Supplementary Material for Fast Vision Transformers with HiLo Attention
Neural Information Processing Systems
Department of Data Science & AI, Monash University, Australia

We organize our supplementary material as follows. In Section A, we describe the architecture specifications of LITv2. In Section B, we derive the computational cost of HiLo attention. In Section C, we study the effect of window size on CIFAR-100. In Section F, we provide more visualisation examples for the spectrum analysis of HiLo attention. Throughout, "ConvFFN" denotes our modified FFN layer, in which we adopt one layer of 3×3 depthwise convolution; we use "ConvFFN Block" to distinguish it from a standard FFN block. The overall framework of LITv2 is depicted in Figure I.
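To make the cost comparison concrete, the following is a minimal sketch (not the paper's exact derivation) of how one might estimate the multiply-accumulate cost of standard multi-head self-attention versus a HiLo-style split, assuming the Hi-Fi branch performs window attention within s×s windows and the Lo-Fi branch average-pools keys/values by a factor of s². The function names and the accounting conventions (one MAC per projection entry, 2·N·M·D per attention score/aggregation pair) are our own illustrative choices.

```python
def msa_flops(n, d):
    """Approximate MACs of standard MSA on n tokens with hidden dim d.

    4*n*d^2 covers the q, k, v and output projections;
    2*n^2*d covers the attention-score and weighted-sum matmuls.
    """
    return 4 * n * d * d + 2 * n * n * d


def hilo_flops(n, d, s, alpha):
    """Approximate MACs of a HiLo-style layer (illustrative accounting).

    n     -- number of tokens (assumed divisible by s*s)
    d     -- total hidden dim
    s     -- window size (Hi-Fi) and pooling factor per side (Lo-Fi)
    alpha -- fraction of channels assigned to the Lo-Fi branch
    """
    d_lo = int(alpha * d)   # Lo-Fi channels
    d_hi = d - d_lo         # Hi-Fi channels

    # Hi-Fi: full q/k/v/out projections plus attention restricted
    # to n/s^2 windows of s^2 tokens each -> 2 * n * s^2 * d_hi.
    hi = 4 * n * d_hi * d_hi + 2 * n * s * s * d_hi

    # Lo-Fi: q and out projections on all n tokens, k/v projections
    # on the pooled n/s^2 tokens, then cross-attention n x n/s^2.
    n_kv = n // (s * s)
    lo = (2 * n * d_lo * d_lo
          + 2 * n_kv * d_lo * d_lo
          + 2 * n * n_kv * d_lo)

    return hi + lo


if __name__ == "__main__":
    # Example: a 14x14 feature map (196 tokens), d = 384, s = 2, alpha = 0.5.
    print(msa_flops(196, 384), hilo_flops(196, 384, 2, 0.5))
```

Under this accounting, the quadratic n² term only survives in the Lo-Fi branch scaled down by s², which is why the savings grow with input resolution.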