QUILL: An Algorithm-Architecture Co-Design for Cache-Local Deformable Attention
Oh, Hyunwoo; Chen, Hanning; Yun, Sanggeon; Ni, Yang; Huang, Wenjun; Das, Tamoghno; Jang, Suyeon; Imani, Mohsen
arXiv.org Artificial Intelligence
Deformable transformers deliver state-of-the-art detection but map poorly to hardware due to irregular memory access and low arithmetic intensity. We introduce QUILL, a schedule-aware accelerator that turns deformable attention into cache-friendly, single-pass work. At its core, Distance-based Out-of-Order Querying (DOOQ) orders queries by spatial proximity; the look-ahead drives a region prefetch into an alternate buffer, forming a schedule-aware prefetch loop that overlaps memory and compute. A fused MSDeformAttn engine executes interpolation, Softmax, aggregation, and the final projection (W''_m) in one pass without spilling intermediates, while small tensors are kept on-chip and surrounding dense layers run on integrated GEMMs. Implemented as RTL and evaluated end-to-end, QUILL achieves up to 7.29x higher throughput and 47.3x better energy efficiency than an RTX 4090, and exceeds prior accelerators by 3.26-9.82x in throughput and 2.01-6.07x in energy efficiency. With mixed-precision quantization, accuracy stays within 0.9 AP of FP32 across Deformable and Sparse DETR variants. By converting sparsity into locality, and locality into utilization, QUILL delivers consistent, end-to-end speedups.
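The idea of ordering queries by spatial proximity and using look-ahead to prefetch the next region can be illustrated with a small software sketch. The abstract does not specify how DOOQ builds its order, so the greedy nearest-neighbour rule below is an assumption, and the names `dooq_order`, `path_length`, and `schedule_with_prefetch` are hypothetical. It is a toy model of the scheduling idea, not the accelerator's actual mechanism.

```python
import math


def dooq_order(ref_points):
    """Greedy nearest-neighbour ordering of query reference points.

    Assumed stand-in for DOOQ: start at query 0 and repeatedly visit the
    closest unvisited query, so consecutive queries touch nearby regions.
    Ties are broken by the lower query index.
    """
    if not ref_points:
        return []
    remaining = set(range(1, len(ref_points)))
    order = [0]
    while remaining:
        current = ref_points[order[-1]]
        nxt = min(remaining, key=lambda i: (math.dist(current, ref_points[i]), i))
        remaining.remove(nxt)
        order.append(nxt)
    return order


def path_length(ref_points, order):
    """Total distance between consecutive queries, used as a locality proxy."""
    return sum(
        math.dist(ref_points[a], ref_points[b])
        for a, b in zip(order, order[1:])
    )


def schedule_with_prefetch(order):
    """Pair each query with the one after it.

    In this toy model, the second element is the query whose region would be
    prefetched into the alternate buffer while the first is being computed.
    The last query has nothing left to prefetch, so it is paired with None.
    """
    return list(zip(order, order[1:] + [None]))


# Four reference points forming two spatial clusters, listed in interleaved order.
points = [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0), (9.0, 10.0)]
order = dooq_order(points)
print(order)
print(path_length(points, list(range(len(points)))), path_length(points, order))
print(schedule_with_prefetch(order))
```

On this input the reordered schedule visits both points of one cluster before moving to the other, so the total distance between consecutive queries is shorter than in the original order. That shorter path is the locality a region prefetcher could exploit.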
Nov-18-2025
- Country:
  - Africa > Rwanda
  - Asia > Japan
    - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
  - Europe
    - Belgium > Flanders
      - Antwerp Province > Antwerp (0.05)
    - France > Île-de-France
    - Spain (0.04)
    - Switzerland > Zürich
      - Zürich (0.14)
    - United Kingdom > Scotland
      - City of Glasgow > Glasgow (0.04)
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States
      - California
        - Los Angeles County > Long Beach (0.04)
        - Monterey County > Monterey (0.04)
        - Orange County > Irvine (0.04)
        - San Francisco County > San Francisco (0.14)
      - Hawaii > Honolulu County
        - Honolulu (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.05)
      - New Jersey > Essex County
        - Newark (0.04)
      - Tennessee > Davidson County
        - Nashville (0.04)
- Genre:
  - Research Report (0.40)
- Technology: