Mitigating Strategy-Selection Bias in Reasoning for More Effective Test-Time Scaling