PRiSM: An Agentic Multimodal Benchmark for Scientific Reasoning via Python-Grounded Evaluation