Bridge the Modality and Capability Gaps in Vision-Language Model Selection

Neural Information Processing Systems 

To better reuse existing VLM resources and fully leverage their potential on different zero-shot image classification tasks, a promising strategy is to select appropriate pre-trained VLMs from the VLM Zoo, relying solely on the text data of the target dataset, without access to the dataset's images.
