VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning

Neural Information Processing Systems 

Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g., ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g., a limited number of expert demonstrations) to convert the task-agnostic representations into more powerful task-specific representations; in stage 3, we fine-tune the agent with online RL.
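
The ordering of the three stages can be made concrete with a small schematic. The sketch below is illustrative only and is not the authors' implementation: the names `Stage`, `STAGES`, and `run_pipeline` are hypothetical, no model is trained, and the code merely records which data source each stage consumes and what it is described as producing.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Stage:
    """One stage of the pipeline (hypothetical container, for illustration only)."""
    number: int
    data_source: str
    outcome: str


# The three stages as described in the text; the strings are descriptive labels only.
STAGES = (
    Stage(1, "non-RL dataset (e.g., ImageNet)", "task-agnostic visual representation"),
    Stage(2, "offline RL data (e.g., expert demonstrations)", "task-specific representation"),
    Stage(3, "online RL interaction", "fine-tuned agent"),
)


def run_pipeline(stages: Sequence[Stage]) -> List[str]:
    """Visit the stages in order and return a log of what each one consumes and produces."""
    log = []
    for expected, stage in enumerate(stages, start=1):
        # Each stage builds on the previous one, so out-of-order stages are rejected.
        if stage.number != expected:
            raise ValueError(f"expected stage {expected}, got stage {stage.number}")
        log.append(f"stage {stage.number}: {stage.data_source} -> {stage.outcome}")
    return log


if __name__ == "__main__":
    for line in run_pipeline(STAGES):
        print(line)
```

The order check reflects the dependency stated above: the task-specific representation of stage 2 starts from the task-agnostic one of stage 1, and online fine-tuning in stage 3 starts from the result of stage 2.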