Using Vision-Language Models as Proxies for Social Intelligence in Human-Robot Interaction
