Contextual Bandits and Imitation Learning with Preference-Based Active Queries
Neural Information Processing Systems
We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.
Feb-9-2026, 01:45:40 GMT
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Washington > King County
- Seattle (0.04)
- Genre:
- Research Report (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Reinforcement Learning (0.94)
- Statistical Learning (0.92)
- Natural Language (1.00)
- Representation & Reasoning (1.00)
- Robots (1.00)