A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules