ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Neural Information Processing Systems
In this way, ViTAE acquires an intrinsic scale-invariance inductive bias (IB) and can learn robust feature representations for objects at various scales.