S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions

Sangwoo Mo 1,2; Minkyu Kim 1,3; Kyungmin Lee 1; Jinwoo Shin