Recurrent model-free RL can be a strong baseline for many POMDPs
