SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models
