From Reproduction to Replication: Evaluating Research Agents with Progressive Code Masking