Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL

Qin-Wen Luo

Neural Information Processing Systems 

Offline reinforcement learning (RL) aims to learn a policy from a fixed dataset without additional interactions with the environment.
