VITA: Video Instance Segmentation via Object Token Association
