DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
