LVLMs are Bad at Overhearing Human Referential Communication
