43207fd5e34f87c48d584fc5c11befb8-Paper.pdf
Neural Information Processing Systems
It is widely believed that model-based RL, where the agent learns a model of the environment and then plans within that model, is significantly more sample efficient than model-free RL. Recent empirical advances also support this belief (e.g.
Oct-2-2025, 18:48:37 GMT