Weighted Maximum Entropy Inverse Reinforcement Learning

Bui, The Viet, Mai, Tien, Jaillet, Patrick

Aug-20-2022, arXiv.org Artificial Intelligence

We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from an expert's demonstrated trajectories. We propose a new way to improve the learning process by adding a weight function to the maximum entropy framework, motivated by the goal of learning and recovering the stochasticity (or bounded rationality) of the expert policy. Our framework and algorithms make it possible to learn both a reward (or policy) function and the structure of the entropy terms added to the Markov decision process, thus enhancing the learning procedure. Our numerical experiments, using human and simulated demonstrations on discrete and continuous IRL/IM tasks, show that our approach outperforms prior algorithms.
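To make the idea of a weighted entropy term concrete, the following is a minimal sketch of the forward (planning) problem only, not the authors' IRL algorithm. It assumes the weight is a positive per-state function mu(s) and that the agent maximizes the expected discounted sum of r(s, a) - mu(s) * log pi(a | s); under that assumption the optimal policy is a per-state softmax of Q(s, a) / mu(s). The function name, the tabular setting, and the toy MDP are all hypothetical choices made for illustration.

```python
import numpy as np

def weighted_soft_value_iteration(P, r, mu, gamma=0.9, iters=500):
    """Soft value iteration with a state-dependent entropy weight (illustrative sketch).

    P:  (S, A, S) array of transition probabilities P[s, a, s'].
    r:  (S, A) array of rewards.
    mu: (S,) array of positive entropy weights (an assumed form of the weight function).
    Returns the soft value function V (S,) and the policy pi (S, A).
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r + gamma * (P @ V)                  # (S, A) soft Q-values
        z = Q / mu[:, None]                      # scale by the per-state weight
        zmax = z.max(axis=1, keepdims=True)      # stabilise the log-sum-exp
        V = mu * (zmax[:, 0] + np.log(np.exp(z - zmax).sum(axis=1)))
    # Policy: softmax of Q / mu, computed state by state.
    Q = r + gamma * (P @ V)
    z = Q / mu[:, None]
    z = z - z.max(axis=1, keepdims=True)
    pi = np.exp(z)
    pi = pi / pi.sum(axis=1, keepdims=True)
    return V, pi

# Toy MDP (hypothetical): 2 states, 2 actions, next state uniform for every action,
# so only the immediate reward separates the two actions.
P = np.full((2, 2, 2), 0.5)
r = np.array([[1.0, 0.0], [1.0, 0.0]])
mu = np.array([0.05, 50.0])                      # small weight in state 0, large in state 1
V, pi = weighted_soft_value_iteration(P, r, mu)
print(pi)
```

In this toy setting a small weight makes the policy in state 0 nearly deterministic, while a large weight pushes the policy in state 1 toward uniform. That is one way a learned weight function could capture how stochastic an expert is in different states. With mu set to 1 everywhere, the sketch reduces to standard maximum entropy soft value iteration.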


Country:
- Asia > Singapore (0.04)
- Europe > Sweden (0.04)
- North America > United States
  - Massachusetts (0.04)
  - Illinois > Cook County
    - Chicago (0.04)

Genre:
- Research Report (1.00)

Industry:
- Transportation > Ground > Road (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Statistical Learning > Maximum Entropy (0.61)
