On the Partial Identifiability in Reward Learning: Choosing the Best Reward

Filippo Lazzati, Alberto Maria Metelli

Jan-10-2025, arXiv.org Machine Learning

When the feedback is not informative enough, the target reward is only partially identifiable, i.e., there exists a set of rewards (the feasible set) that are equally-compatible with the feedback. In this paper, we show that there exists a choice of reward, non-necessarily contained in the feasible set that, depending on the ReL application, improves the performance w.r.t. ...

However, in practice, ReL has been successfully applied only to IL (Ho & Ermon, 2016) and reward design (Christiano et al., 2017). The most significant issue that prevents the use of ReL algorithms to other applications is partial identifiability (Cao et al., 2021; Kim et al., 2021; Skalse et al., 2023b). Indeed, the target reward may not be uniquely determined from the given feedback, but there is a set of reward ...

artificial intelligence, machine learning, reinforcement learning

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Reinforcement Learning (0.94)
