You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection
Yuxin Fang 1 Bencheng Liao 1 Xinggang Wang 1 Jiemin Fang 2, 1
Neural Information Processing Systems
To answer this question, we present You Only Look at One Sequence (YOLOS), a series of object detection models based on the vanilla Vision Transformer with the fewest possible modifications, region priors, and inductive biases of the target task.
Nov-15-2025, 20:37:27 GMT