ImageRef-VL: Enabling Contextual Image Referencing in Vision-Language Models
