Rethinking Value Function Learning for Generalization in Reinforcement Learning