Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning

Open in new window