Option Compatible Reward Inverse Reinforcement Learning
Rakhoon Hwang, Hanjin Lee, Hyung Ju Hwang
Reinforcement learning with complex tasks is a challenging problem. Expert demonstrations of complex multitasking operations are often required to train agents, but it is difficult to design a reward function for such tasks. In this paper, we solve a hierarchical inverse reinforcement learning (IRL) problem within the framework of options. A gradient method for parametrized options is used to deduce a defining equation for the Q-feature space, which in turn leads to a reward feature space. Using a second-order optimality condition on the option parameters, an optimal reward function is selected. Experimental results in both discrete and continuous domains confirm that our segmented rewards solve the IRL problem for multitasking operations, with good performance and robustness to noise in the expert demonstrations.
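The abstract's "framework of options" refers to the standard options formalism for temporally extended actions (an initiation set, an intra-option policy, and a termination function). The sketch below illustrates that formalism on a toy chain environment; all names and the environment are illustrative assumptions, not the authors' implementation.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set

# Illustrative sketch of the options formalism the paper builds on.
# An option is a triple (I, pi, beta): where it may start, how it acts,
# and with what probability it terminates in each state.

@dataclass
class Option:
    initiation: Set[int]                 # I: states where the option may start
    policy: Callable[[int], int]         # pi: intra-option policy, state -> action
    termination: Callable[[int], float]  # beta(s): probability of terminating in s

def run_option(option: Option, step: Callable[[int, int], int],
               state: int, rng: random.Random, max_steps: int = 100):
    """Execute an option until its termination condition fires;
    return the sequence of visited states."""
    assert state in option.initiation, "option not available in this state"
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        if rng.random() < option.termination(state):
            break
    return trajectory

# Toy 1-D chain: a hypothetical "move right until reaching state 5" option.
go_right = Option(
    initiation={0, 1, 2, 3, 4},
    policy=lambda s: +1,                          # always step right
    termination=lambda s: 1.0 if s >= 5 else 0.0, # terminate deterministically at 5
)
traj = run_option(go_right, step=lambda s, a: s + a, state=0, rng=random.Random(0))
# traj visits 0 through 5 and then terminates
```

In the paper's setting, the intra-option policies and termination functions are parametrized, and the gradient with respect to those parameters is what yields the defining equation for the Q-feature space.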
Nov-6-2019