TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives Maitreya Patel Changhoon Kim

Open in new window