Active Learning for Risk-Sensitive Inverse Reinforcement Learning

Chen, Rui, Wang, Wenshuo, Zhao, Zirui, Zhao, Ding

Sep-23-2019–arXiv.org Machine Learning

Abstract: One typical assumption in inverse reinforcement learning (IRL) is that human experts act to optimize the expected utility of a stochastic cost with a fixed distribution. Risk-sensitive inverse reinforcement learning (RS-IRL) relaxes this assumption by supposing that humans evaluate a random cost with respect to a set of subjectively distorted distributions instead of a fixed one. This assumption provides additional flexibility to model human risk preferences, represented by a risk envelope, in safety-critical tasks. However, like other learning-from-demonstration techniques, RS-IRL can suffer from inefficient learning due to redundant demonstrations. Inspired by the concept of active learning, this research derives a probabilistic disturbance sampling scheme that enables an RS-IRL agent to query expert support that is likely to expose unrevealed boundaries of the expert's risk envelope. Experimental results confirm that our approach accelerates the convergence of RS-IRL algorithms with lower variance while still guaranteeing unbiased convergence.

Inverse reinforcement learning (IRL) provides a novel framework for recovering the cost functions used in human decision making [1]-[6]. The original IRL algorithms [1], [2] are formulated as linear programs constrained by optimality conditions [7]. More recent advancements in IRL include the guided cost learning algorithm [10], which combines MaxEnt IRL with deep learning techniques. The flexibility of the IRL framework has prompted its application to a variety of tasks such as autonomous helicopter aerobatics [11] and robot locomotion [12].
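To make the risk-envelope idea from the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a finite set of outcomes and a polytopic risk envelope given by a handful of vertex distributions; the function names and all numbers are hypothetical. A risk-neutral agent scores a cost by its expectation under one fixed distribution, whereas a risk-sensitive agent in this sketch scores it by the worst-case expectation over the envelope. Because the expectation is linear in the distribution, checking the vertices of the polytope is sufficient.

```python
# Illustrative sketch only: names and numbers are hypothetical, not from the paper.

def expected_cost(costs, probs):
    """Expected cost of a discrete random cost under one fixed distribution."""
    return sum(p * c for p, c in zip(probs, costs))


def risk_sensitive_cost(costs, envelope_vertices):
    """Worst-case expected cost over a polytopic risk envelope.

    The envelope is assumed to be the convex hull of `envelope_vertices`.
    The expectation is linear in the distribution, so its maximum over the
    polytope is attained at one of the vertices.
    """
    return max(expected_cost(costs, q) for q in envelope_vertices)


# Three possible outcomes with their costs (hypothetical values).
costs = [1.0, 2.0, 10.0]

# Fixed nominal distribution assumed by standard IRL.
nominal = [0.5, 0.3, 0.2]

# Vertices of an assumed risk envelope: distorted versions of the nominal
# distribution, including the nominal one itself.
envelope_vertices = [
    [0.5, 0.3, 0.2],
    [0.3, 0.3, 0.4],  # puts more weight on the costly outcome
    [0.6, 0.3, 0.1],  # puts less weight on the costly outcome
]

risk_neutral = expected_cost(costs, nominal)
risk_averse = risk_sensitive_cost(costs, envelope_vertices)

print(f"risk-neutral cost: {risk_neutral:.2f}")
print(f"risk-sensitive cost: {risk_averse:.2f}")
```

In this toy setting, a larger envelope yields a more pessimistic evaluation, and an envelope containing only the nominal distribution recovers the risk-neutral case. Recovering the envelope from demonstrations, which is the task RS-IRL addresses, is not shown here.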

Country:
- North America
  - United States
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.14)
    - Michigan > Washtenaw County
      - Ann Arbor (0.14)
    - California > San Francisco County
      - San Francisco (0.14)
  - Canada > Ontario
    - Toronto (0.04)
- Asia
  - Taiwan > Taiwan Province
    - Taipei (0.04)
  - China > Shaanxi Province
    - Xi'an (0.04)

Genre:
- Research Report (0.70)

Industry:
- Transportation (0.88)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Neural Networks > Deep Learning (0.54)
