S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions
