ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

Irene Huang 1 Wei Lin