Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain?
