When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research
