Evaluating the Robustness of Open-Source Vision-Language Models to Domain Shift in Object Captioning
