SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
