FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning
In this paper, we propose a new type of Actor, named forward-looking Actor, or FORK for short, for Actor-Critic algorithms. FORK can be easily integrated into a model-free Actor-Critic algorithm. Our experiments on six Box2D and MuJoCo environments with continuous state and action spaces demonstrate the significant performance improvement that FORK can bring to state-of-the-art algorithms. A variation of FORK can further solve BipedalWalkerHardcore in as few as four hours using a single GPU.

Deep reinforcement learning has achieved tremendous success, and sometimes even superhuman performance, in a wide range of applications including board games (Silver et al., 2016), video games (Vinyals et al., 2019), and robotics (Haarnoja et al., 2018a). A key to these recent successes is the use of deep neural networks as high-capacity function approximators that can harvest a large number of data samples to approximate high-dimensional state or action value functions. This addresses one of the most challenging issues in reinforcement learning: problems with very large state and action spaces. Many modern reinforcement learning algorithms are model-free, so they are applicable in different environments and can readily react to new and unseen states. This paper considers model-free reinforcement learning for problems with continuous state and action spaces, in particular the Actor-Critic method, where the Critic evaluates the state or action values of the Actor's policy and the Actor improves the policy based on the value estimates from the Critic.
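The division of labor between Actor and Critic described above can be illustrated with a minimal sketch. This is a generic Actor-Critic loop, not FORK and not the authors' implementation. Everything in it is an assumption made for illustration: the one-step toy problem, the `reward` function, the linear policy `a = theta * s`, the quadratic critic features, the least-squares critic fit (real Actor-Critic methods typically use neural networks and bootstrapped targets), and all hyperparameters.

```python
import numpy as np


def features(s, a):
    # Quadratic features for a linear-in-parameters critic Q(s, a) = w . phi(s, a).
    return np.stack([s * s, s * a, a * a], axis=-1)


def reward(s, a):
    # Hypothetical toy reward: the best action in state s is a = 2 * s.
    return -(a - 2.0 * s) ** 2


def train_actor_critic(iterations=50, batch_size=256, actor_lr=0.5,
                       noise_std=0.3, seed=0):
    rng = np.random.default_rng(seed)
    theta = 0.0          # Actor parameter: deterministic policy a = theta * s
    w = np.zeros(3)      # Critic parameters
    for _ in range(iterations):
        # Collect a batch with exploration noise added to the Actor's action.
        s = rng.uniform(-1.0, 1.0, size=batch_size)
        a = theta * s + noise_std * rng.normal(size=batch_size)
        r = reward(s, a)

        # Critic step: fit Q(s, a) to the observed returns. Episodes here last
        # one step (an assumption), so the return is just the reward.
        w, *_ = np.linalg.lstsq(features(s, a), r, rcond=None)

        # Actor step: move theta along dQ/da * da/dtheta, evaluated at the
        # Actor's own (noise-free) action, using only the Critic's estimate.
        a_pi = theta * s
        dq_da = w[1] * s + 2.0 * w[2] * a_pi
        theta += actor_lr * np.mean(dq_da * s)
    return theta, w


if __name__ == "__main__":
    theta, w = train_actor_critic()
    print("actor parameter:", theta)
    print("critic weights:", w)
```

In this toy setting the Actor never sees the reward function directly; it improves only through the Critic's value estimate, so `theta` should approach 2 as the Critic's weights approach the coefficients of the true quadratic.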
Oct-4-2020