From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens