Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning