Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models