Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms
