MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts

Neural Information Processing Systems 

As an extension of the traditional visual single object tracking (SOT) task [2, 3, 4], vision-language tracking (VLT) can harness the complementary advantages of multiple modalities. Vision-language trackers (VLTs) therefore have the potential to achieve better tracking performance, and they have recently attracted widespread attention [5, 6, 7, 8].
