BOP-ASK: Object-Interaction Reasoning for Vision-Language Models
