From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens

Open in new window