Object-level Vision-Language Contrastive Pre-training
