Learning Memory-Dependent Continuous Control from Demonstrations

Feb-18-2021–arXiv.org Artificial Intelligence

Efficient exploration has presented a long-standing challenge in reinforcement learning, especially when rewards are sparse. A developmental system can overcome this difficulty by learning from both demonstrations and self-exploration. However, existing methods are not applicable to most real-world robotic control problems because they assume that environments follow Markov decision processes (MDPs); thus, they do not extend to partially observable environments where historical observations are necessary for decision making. This paper builds on the idea of replaying demonstrations for memory-dependent continuous control by proposing a novel algorithm, Recurrent Actor-Critic with Demonstration and Experience Replay (READER). Experiments involving several memory-crucial continuous control tasks reveal that our method significantly reduces interactions with the environment while using a reasonably small number of demonstration samples. The algorithm also shows better sample efficiency and learning capabilities than a baseline reinforcement learning algorithm for memory-based control from demonstrations.

Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia
  - South Korea (0.04)
  - Middle East > Jordan (0.04)
  - Japan
    - Kyūshū & Okinawa > Okinawa (0.04)
    - Honshū > Kantō
      - Tochigi Prefecture > Utsunomiya (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Leisure & Entertainment (0.68)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.89)
