TopViewRS: Vision-Language Models as Top-View Spatial Reasoners
