UniViT: Unifying Image and Video Understanding in One Vision Encoder
