Bridge the Modality and Capability Gaps in Vision-Language Model Selection
