Appendix 1 Positional Encoding

Neural Information Processing Systems

The self-attention module is obviously permutation-invariant, so it cannot "understand" the spatial order of its input tokens on its own. Before applying it in our tracker's encoder and decoder networks, we therefore extend the untied positional encoding to a multi-dimensional version; together with the relative positional bias, this yields the attention bias term α_ij for the n-dimensional case. From Tab. 1, we observe that our tracker is still competitive and obtains the best performance on this benchmark.
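The permutation-invariance argument above can be checked numerically. The following is a minimal NumPy sketch, not the paper's implementation: it uses a single head with identity query/key/value projections and a toy distance-based relative bias (both simplifying assumptions). Without a positional term, permuting the input tokens merely permutes the outputs; adding a position-dependent bias breaks this symmetry, which is what lets the model "understand" token order.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, bias=None):
    # Single-head attention with identity Q/K/V projections (toy setup).
    scores = X @ X.T / np.sqrt(X.shape[1])
    if bias is not None:
        scores = scores + bias      # position-dependent additive bias
    return softmax(scores) @ X

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))         # 4 tokens, feature dim 8
P = np.eye(4)[[2, 0, 3, 1]]         # a permutation matrix

# Without positional bias: attention is permutation-equivariant,
# i.e. attn(P X) == P attn(X).
out = self_attention(X)
out_perm = self_attention(P @ X)
equivariant = np.allclose(P @ out, out_perm)

# With a toy 1-D relative bias b_ij = -|i - j| (an assumption for
# illustration, not the paper's learned bias), the symmetry is broken.
idx = np.arange(4)
bias = -np.abs(idx[:, None] - idx[None, :]).astype(float)
out_b = self_attention(X, bias)
out_b_perm = self_attention(P @ X, bias)
equivariant_with_bias = np.allclose(P @ out_b, out_b_perm)

print(equivariant)            # permuting inputs just permutes outputs
print(equivariant_with_bias)  # no longer holds once positions matter
```

The same reasoning carries over to the multi-dimensional case: the bias simply becomes a function of the n-dimensional relative offset between token positions instead of a 1-D distance.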


SwinTrack: A Simple and Strong Baseline for Transformer Tracking
Liting Lin, Heng Fan

Neural Information Processing Systems

Recently, Transformer has been widely explored in tracking and has shown state-of-the-art (SOTA) performance. However, existing efforts mainly focus on fusing and enhancing features generated by convolutional neural networks (CNNs); the potential of Transformer in representation learning remains under-explored.