Can VLMs Recall Factual Associations From Visual References?