Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
Neural Information Processing Systems
We propose stochastic ensemble value expansion (STEVE), a novel model-based technique that addresses the problem of errors in a learned dynamics model degrading performance. By dynamically interpolating between model rollouts of various horizon lengths for each individual example, STEVE ensures that the model is used only when doing so does not introduce significant errors.
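The interpolation between rollout horizons can be illustrated with a minimal sketch. It assumes that each horizon yields several candidate value targets (one per ensemble member) and that horizons are weighted by the inverse variance of those candidates; the abstract does not state the weighting rule, so treat that choice as an assumption. The names `interpolate_targets` and `candidates` are hypothetical.

```python
import numpy as np


def interpolate_targets(candidates, eps=1e-8):
    """Combine per-horizon value targets into one target.

    candidates: array-like of shape (num_horizons, ensemble_size), where
    row h holds the ensemble's estimates of the target for rollout
    horizon h. This layout is an illustrative assumption.

    Returns (target, weights). Weights are proportional to the inverse
    of each horizon's ensemble variance (assumed weighting rule), so
    horizons on which the ensemble disagrees contribute little.
    """
    candidates = np.asarray(candidates, dtype=float)
    means = candidates.mean(axis=1)            # per-horizon mean target
    variances = candidates.var(axis=1) + eps   # per-horizon disagreement
    weights = 1.0 / variances
    weights = weights / weights.sum()          # normalize to sum to 1
    target = float(weights @ means)
    return target, weights


# Horizon 0: the ensemble agrees closely. Horizon 1: it disagrees widely.
example = [[1.0, 1.2], [0.0, 4.0]]
target, weights = interpolate_targets(example)
print(target, weights)
```

In this toy input the short-horizon estimates agree closely, so nearly all of the weight falls on them and the long, uncertain rollout is effectively ignored for this example.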
Nov-20-2025, 21:06:17 GMT
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe > Sweden
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States > California
      - Santa Clara County > Mountain View (0.04)
- Genre:
  - Research Report (1.00)
- Technology: