ImageRef-VL: Enabling Contextual Image Referencing in Vision-Language Models