Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning