31839b036f63806cba3f47b93af8ccb5-Paper.pdf

Neural Information Processing Systems 

Offline reinforcement learning (RL) tasks require the agent to learn from a precollected dataset with no further interactions with the environment. Despite the potential to surpass the behavioral policies, RL-based methods are generally impractical due to training instability and the bootstrapping of extrapolation errors, which always require careful hyperparameter tuning via online evaluation.
