A Appendix: Learning Guidance Rewards with Trajectory-space Smoothing

A.1 Monte-Carlo Estimate of the Guidance Rewards