GRASP: A novel benchmark for evaluating language GRounding And Situated Physics understanding in multimodal language models