HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Neural Information Processing Systems
Recent progress in vision Transformers has shown great success across a variety of tasks, driven by a new spatial modeling mechanism based on dot-product self-attention. In this paper, we show that the key ingredients behind vision Transformers, namely input-adaptive, long-range, and high-order spatial interactions, can also be implemented efficiently in a convolution-based framework. We present the Recursive Gated Convolution ($\textit{g}^\textit{n}$Conv), which performs high-order spatial interactions using gated convolutions and a recursive design. The new operation is highly flexible and customizable: it is compatible with various variants of convolution and extends the two-order interactions in self-attention to arbitrary orders without introducing significant extra computation.
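To make the recursive gating idea concrete, the following is a minimal single-channel, 1-D sketch in plain Python. It is not the authors' implementation: the function names, the scalar `proj_weights`, and the zero-padded convolution are all assumptions chosen for illustration. It only shows the general pattern of repeatedly multiplying a running feature by a convolved projection of the input, so that each step adds one more multiplicative interaction.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; output length equals input length.

    Assumes an odd-length kernel so the window is centred on each position.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def recursive_gated_conv(x, order, kernel, proj_weights):
    """Toy recursion: p <- conv(q_k) * p, repeated `order` times.

    `proj_weights` is a hypothetical list of order + 1 scalars standing in
    for learned linear projections: the first yields p_0, the remaining
    ones yield q_0 .. q_{order-1}.
    """
    if len(proj_weights) != order + 1:
        raise ValueError("expected order + 1 projection weights")
    p = [proj_weights[0] * v for v in x]
    for k in range(order):
        q = [proj_weights[k + 1] * v for v in x]
        gate = conv1d_same(q, kernel)               # spatial mixing of q_k
        p = [g * pv for g, pv in zip(gate, p)]      # element-wise gating
    return p


x = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 1.0, 1.0]
result = recursive_gated_conv(x, order=2, kernel=kernel, proj_weights=[1.0, 1.0, 1.0])
print(result)  # [9.0, 72.0, 243.0, 196.0]

# Doubling the input multiplies the output by 2 ** (order + 1) = 8,
# because two gating steps make this toy output cubic in the input.
doubled = recursive_gated_conv([2.0 * v for v in x], 2, kernel, [1.0, 1.0, 1.0])
print(doubled)  # [72.0, 576.0, 1944.0, 1568.0]
```

In this toy version the cost grows linearly with `order`, since each extra step is one more convolution and one more element-wise product; how the actual operation allocates channels across steps is not captured here.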
Dec-24-2025, 02:48:53 GMT