Imitation Learning from Vague Feedback

Mar-27-2025, 13:59:24 GMT–Neural Information Processing Systems

Imitation learning from human feedback studies how to train well-performing imitation agents from an annotator's relative comparison of two demonstrations (one demonstration is better or worse than the other), which is usually easier to collect than the perfect expert data required by traditional imitation learning. However, in many real-world applications, it is still expensive or even impossible to provide a clear pairwise comparison between two demonstrations of similar quality. This motivates us to study the problem of imitation learning with vague feedback, where the data annotator can distinguish the paired demonstrations correctly only when their quality differs significantly, i.e., when one comes from the expert and the other from a non-expert. By modeling the underlying demonstration pool as a mixture of expert and non-expert data, we show that the expert policy distribution can be recovered when the proportion α of expert data is known. We also propose a mixture proportion estimation method for the case where α is unknown.
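The mixture idea in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's algorithm: it assumes discrete distributions over three actions and, as a simplifying assumption that the abstract does not state, that the non-expert component is available directly. The function names `mix` and `recover_expert` are hypothetical. Under those assumptions, a pool of the form α · expert + (1 − α) · non-expert can be inverted for the expert component once α is known.

```python
def mix(expert, non_expert, alpha):
    """Pool distribution: alpha * expert + (1 - alpha) * non_expert."""
    return [alpha * e + (1 - alpha) * n for e, n in zip(expert, non_expert)]


def recover_expert(pool, non_expert, alpha):
    """Invert the mixture for the expert component, assuming alpha is known.

    Illustrative assumption: the non-expert distribution is given exactly.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    raw = [(p - (1 - alpha) * n) / alpha for p, n in zip(pool, non_expert)]
    # Clip small negative values caused by rounding or estimation noise,
    # then renormalize so the result sums to one.
    clipped = [max(r, 0.0) for r in raw]
    total = sum(clipped)
    return [c / total for c in clipped]


# Toy distributions over three actions (made-up numbers for illustration).
expert = [0.7, 0.2, 0.1]
non_expert = [0.2, 0.3, 0.5]
alpha = 0.4  # proportion of expert data in the pool

pool = mix(expert, non_expert, alpha)
recovered = recover_expert(pool, non_expert, alpha)
```

In this toy setting `recovered` matches `expert` up to floating-point error. With a misspecified α the inversion would be biased, which is one way to see why estimating the mixture proportion matters when α is unknown.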

demonstration, machine learning, reinforcement learning

Country:
- Europe (1.00)
- North America > United States
  - California (0.28)

Genre:
- Research Report (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (0.67)
    - Reinforcement Learning (0.93)
  - Robots (1.00)

Imitation Learning from Vague Feedback The University of Tokyo, Tokyo, Japan
