Learning One Representation to Optimize All Rewards