A Reduction from Apprenticeship Learning to Classification

Feb-16-2024, 09:52:37 GMT–Neural Information Processing Systems

We provide new theoretical results for apprenticeship learning, a variant of reinforcement learning in which the true reward function is unknown and the goal is to perform well relative to an observed expert. We study a common approach to learning from expert demonstrations: using a classification algorithm to learn to imitate the expert's behavior. Although this straightforward learning strategy is widely used in practice, it has been subject to very little formal analysis. We prove that, if the learned classifier has error rate \epsilon, the difference between the value of the apprentice's policy and the expert's policy is O(\sqrt{\epsilon}). Further, we prove that this difference is only O(\epsilon) when the expert's policy is close to optimal.
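The quantities in the abstract can be made concrete with a small sketch. The Python below is an illustration only, not the paper's construction: the five-state chain MDP, the horizon, the tabular majority-vote "classifier", the corrupted demonstration labels, and every function name are assumptions introduced here. It also assumes that value means average per-step reward and that the error rate is measured on the states the expert visits. It computes the classifier's error rate and the resulting gap between the expert's and the apprentice's values; it does not verify the stated bounds, whose hidden constants are not reproduced here.

```python
from collections import Counter, defaultdict

N_STATES = 5   # hypothetical chain MDP with states 0..4
HORIZON = 10   # hypothetical episode length

def step(state, action):
    """Deterministic chain dynamics: action 1 moves right, action 0 moves left."""
    if action == 1:
        return min(state + 1, N_STATES - 1)
    return max(state - 1, 0)

def reward(state):
    """Reward in [0, 1]: 1 only in the rightmost state."""
    return 1.0 if state == N_STATES - 1 else 0.0

def rollout(policy, start=0):
    """States visited over HORIZON steps when following a state -> action dict."""
    states, s = [], start
    for _ in range(HORIZON):
        states.append(s)
        s = step(s, policy[s])
    return states

def value(policy):
    """Average per-step reward along the (deterministic) trajectory."""
    return sum(reward(s) for s in rollout(policy)) / HORIZON

def fit_majority_classifier(demos):
    """Tabular stand-in for a classifier: most frequent demonstrated action per state."""
    votes = defaultdict(Counter)
    for s, a in demos:
        votes[s][a] += 1
    return {s: votes[s].most_common(1)[0][0] if s in votes else 0
            for s in range(N_STATES)}

def error_rate(learned, expert):
    """Fraction of expert-visited time steps where the learned action disagrees."""
    visited = rollout(expert)
    return sum(learned[s] != expert[s] for s in visited) / len(visited)

# The assumed expert always moves right.
expert = {s: 1 for s in range(N_STATES)}

# Expert demonstrations plus two deliberately corrupted labels at state 2,
# so the learned classifier makes a mistake there.
demos = [(s, expert[s]) for s in rollout(expert)] + [(2, 0), (2, 0)]

learned = fit_majority_classifier(demos)
eps = error_rate(learned, expert)
gap = value(expert) - value(learned)
print(f"classification error = {eps:.2f}, value gap = {gap:.2f}")
```

In this toy instance a single misclassified state keeps the apprentice from ever reaching the rewarding state, so a small classification error produces a large value gap. That compounding of mistakes over the horizon is the phenomenon the analysis has to control.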

apprenticeship learning, classification, demonstration
