In-Context Learning Improves Compositional Understanding of Vision-Language Models