Exploring Failure Cases in Multimodal Reasoning About Physical Dynamics