Option Compatible Reward Inverse Reinforcement Learning
Rakhoon Hwang, Hanjin Lee, Hyung Ju Hwang
Reinforcement learning with complex tasks is a challenging problem. Expert demonstrations of complex multitasking operations are often required to train agents, but it is difficult to design a reward function for such tasks. In this paper, we solve a hierarchical inverse reinforcement learning (IRL) problem within the framework of options. A gradient method for parametrized options is used to deduce a defining equation for the Q-feature space, which in turn leads to a reward feature space. Using a second-order optimality condition on the option parameters, an optimal reward function is selected. Experimental results in both discrete and continuous domains confirm that our segmented rewards solve the IRL problem for multitasking operations, with good performance and robustness to noise in the expert demonstrations.
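The abstract's "framework of options" refers to the standard options formalism for temporally extended actions (an initiation set, an intra-option policy, and a termination function). The sketch below illustrates that formalism on a toy chain environment; all names and the environment are illustrative assumptions, not the authors' implementation.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set

# Illustrative sketch of the options formalism the paper builds on.
# An option is a triple (I, pi, beta): where it may start, how it acts,
# and with what probability it terminates in each state.

@dataclass
class Option:
    initiation: Set[int]                 # I: states where the option may start
    policy: Callable[[int], int]         # pi: intra-option policy, state -> action
    termination: Callable[[int], float]  # beta(s): probability of terminating in s

def run_option(option: Option, step: Callable[[int, int], int],
               state: int, rng: random.Random, max_steps: int = 100):
    """Execute an option until its termination condition fires;
    return the sequence of visited states."""
    assert state in option.initiation, "option not available in this state"
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        if rng.random() < option.termination(state):
            break
    return trajectory

# Toy 1-D chain: a hypothetical "move right until reaching state 5" option.
go_right = Option(
    initiation={0, 1, 2, 3, 4},
    policy=lambda s: +1,                          # always step right
    termination=lambda s: 1.0 if s >= 5 else 0.0, # terminate deterministically at 5
)
traj = run_option(go_right, step=lambda s, a: s + a, state=0, rng=random.Random(0))
# traj visits 0 through 5 and then terminates
```

In the paper's setting, the intra-option policies and termination functions are parametrized, and the gradient with respect to those parameters is what yields the defining equation for the Q-feature space.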
Nov-6-2019