The Power of Offline Reinforcement Learning
Reinforcement learning has advanced rapidly in the past few years, from tabular methods that can only solve simple toy problems to powerful algorithms that tackle highly complex problems such as playing Go, learning robotic manipulation skills, or controlling autonomous vehicles. Unfortunately, adoption of RL in real-world applications has been slow. Current RL methods have proven that they can find high-performing policies for challenging problems with high-dimensional raw observations (such as images), but actually using them is often difficult or impractical. This is in stark contrast to supervised learning methods, which are widely and successfully used across industry and research. Most RL research papers and implementations are geared towards the online learning setting, in which the agent interacts with an environment and gathers data, using its current policy and some exploration scheme to explore the state-action space and find higher-reward regions. Such online RL algorithms use the gathered experience to update the policy either immediately or via a replay buffer.
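To make the online setting concrete, here is a minimal sketch of such a loop. It is an illustration under assumptions, not a method taken from this article: the `ChainEnv` environment, the `train_online` function, and all hyperparameter values are hypothetical, and tabular Q-learning with epsilon-greedy exploration stands in for whatever algorithm and exploration scheme a real system would use.

```python
import random
from collections import deque


class ChainEnv:
    """Hypothetical toy environment: walk right along a chain to reach the goal."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the ends of the chain clamp the move.
        delta = 1 if action == 1 else -1
        self.state = min(max(self.state + delta, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_online(episodes=300, epsilon=0.2, alpha=0.5, gamma=0.9,
                 batch_size=16, max_steps=100, seed=0):
    """Online tabular Q-learning with a replay buffer (illustrative values only)."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.n_states)]  # Q-table: q[state][action]
    buffer = deque(maxlen=1000)                    # replay buffer of transitions

    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Exploration scheme: epsilon-greedy over the current Q-values.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])

            # Interact with the environment and store the new experience.
            next_state, reward, done = env.step(action)
            buffer.append((state, action, reward, next_state, done))

            # Update the policy from a random minibatch of stored experience.
            batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
            for s, a, r, s2, d in batch:
                target = r if d else r + gamma * max(q[s2])
                q[s][a] += alpha * (target - q[s][a])

            state = next_state
            if done:
                break
    return q
```

Calling `train_online()` returns the learned Q-table; reading off the higher-valued action in each state gives the greedy policy. The point of the sketch is that every update depends on fresh interaction: the agent must be able to act in the environment, including taking exploratory actions, while it learns.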
Nov-9-2020, 18:30:22 GMT