Improving Multimodal Contrastive Learning of Sentence Embeddings with Object-Phrase Alignment
