Bounding the Optimal Value Function in Compositional Reinforcement Learning