All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
