From Reproduction to Replication: Evaluating Research Agents with Progressive Code Masking
