Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
