A Bayesian Approach to Robust Inverse Reinforcement Learning
Ran Wei, Siliang Zeng, Chenliang Li, Alfredo Garcia, Anthony McDonald, Mingyi Hong
–arXiv.org Artificial Intelligence
Inverse reinforcement learning (IRL) is the problem of extracting the reward function and policy of a value-maximizing agent from its behavior [1, 2]. IRL is an important tool in domains where manually specifying reward functions or policies is difficult, such as autonomous driving [3], or where the extracted reward function can reveal novel insights about a target population and be used to devise interventions, as in biology, economics, and human-robot interaction studies [4, 5, 6]. However, wider application of IRL faces two interrelated algorithmic challenges: 1) obtaining access to the target deployment environment or an accurate simulator thereof, and 2) ensuring robustness of the learned policy and reward function under covariate shift between the training and deployment environments or datasets [7, 8, 9]. In this paper, we focus on model-based offline IRL to address challenge 1). A notable class of model-based offline IRL methods estimates the dynamics and reward in a two-stage fashion (see Figure 1) [10, 11, 12, 13]. In the first stage, a dynamics model is estimated from the offline data.

Figure 1: Objectives of the traditional two-stage IRL and the proposed simultaneous estimation approach of Bayesian model-based IRL.
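To make the two-stage recipe concrete, the following is a minimal sketch, assuming a tabular MDP and max-entropy IRL for the second stage. The function names (`fit_dynamics`, `two_stage_irl`, etc.) and all hyperparameters are illustrative placeholders, not the notation or method of this paper.

```python
import numpy as np

def fit_dynamics(transitions, n_states, n_actions):
    """Stage 1: maximum-likelihood tabular dynamics from offline (s, a, s') tuples."""
    counts = np.full((n_states, n_actions, n_states), 1e-3)  # light smoothing
    for s, a, s_next in transitions:
        counts[s, a, s_next] += 1.0
    return counts / counts.sum(axis=-1, keepdims=True)

def soft_value_iteration(P, r, gamma=0.95, n_iters=200):
    """Soft (max-entropy) Bellman backups under estimated dynamics P and state reward r."""
    V = np.zeros(P.shape[0])
    for _ in range(n_iters):
        Q = r[:, None] + gamma * (P @ V)                       # Q[s, a]
        Q_max = Q.max(axis=1, keepdims=True)
        V = (Q_max + np.log(np.exp(Q - Q_max).sum(axis=1, keepdims=True))).ravel()
    return np.exp(Q - V[:, None])                              # Boltzmann policy pi(a|s)

def state_visitation(P, policy, start_dist, horizon=50):
    """Average state-visitation frequencies of `policy` rolled out in the learned model."""
    d, total = start_dist.copy(), np.zeros_like(start_dist)
    for _ in range(horizon):
        total += d
        d = np.einsum('s,sa,sat->t', d, policy, P)             # one-step push-forward
    return total / horizon

def two_stage_irl(transitions, expert_states, n_states, n_actions,
                  start_dist, lr=0.1, n_epochs=100):
    """Stage 2: max-entropy IRL gradient steps on the reward, with the stage-1
    model standing in for the (inaccessible) real environment."""
    P = fit_dynamics(transitions, n_states, n_actions)
    mu_expert = np.bincount(expert_states, minlength=n_states).astype(float)
    mu_expert /= mu_expert.sum()
    r = np.zeros(n_states)
    for _ in range(n_epochs):
        policy = soft_value_iteration(P, r)
        mu_model = state_visitation(P, policy, start_dist)
        r += lr * (mu_expert - mu_model)                        # match visitation frequencies
    return r, P
```

Note that in this decoupled pipeline the reward is fit entirely inside the stage-1 model, so errors in the estimated dynamics propagate directly into the learned reward; the paper contrasts this with its proposed Bayesian formulation, in which dynamics and reward are estimated simultaneously (Figure 1).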
Sep-15-2023