AWAC: Accelerating Online Reinforcement Learning with Offline Datasets

Our method learns complex behaviors by training offline from prior datasets (expert demonstrations, data from previous experiments, or random exploration data) and then fine-tuning quickly with online interaction.

Robots trained with reinforcement learning (RL) have the potential to be used across a huge variety of challenging real-world problems. To apply RL to a new problem, you typically set up the environment, define a reward function, and train the robot to solve the task by letting it explore the new environment from scratch. While this may eventually work, these "online" RL methods are data-hungry, and repeating this data-inefficient process for every new problem makes it difficult to apply online RL to real-world robotics problems. What if, instead of repeating the data collection and learning process from scratch every time, we were able to reuse data across multiple problems or experiments?
