Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition