Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages