 Reinforcement Learning








VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning

Neural Information Processing Systems

Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g. ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g. a limited number of expert demonstrations) to convert the task-agnostic representations into more powerful task-specific representations; in stage 3, we fine-tune the agent with online RL.



58a799d16fb0c1f2014e98f4ba972b25-Paper-Conference.pdf

Neural Information Processing Systems

RL that utilize function approximation to generalize observational data to unknown states/actions. The goal of this paper is to study the sample complexity of policy-based RL, which is arguably the simplest setting for RL with function approximation (Kearns et al., 1999; Kakade, 2003).