Pragmatic Inference with a CLIP Listener for Contrastive Captioning
