31839b036f63806cba3f47b93af8ccb5-Paper.pdf
Neural Information Processing Systems
Offline reinforcement learning (RL) tasks require the agent to learn from a precollected dataset with no further interactions with the environment. Despite the potential to surpass the behavioral policies, RL-based methods are generally impractical due to training instability and the bootstrapping of extrapolation errors, which always require careful hyperparameter tuning via online evaluation.
Feb-19-2026, 00:14:57 GMT