The problem with DDPG: understanding failures in deterministic environments with sparse rewards
