Does Data Scaling Lead to Visual Compositional Generalization?

Open in new window