Predicting optimal value functions by interpolating reward functions in scalarized multi-objective reinforcement learning