SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities