The Art of Scaling Test-Time Compute for Large Language Models