Contextual Bandits and Imitation Learning with Preference-Based Active Queries
Neural Information Processing Systems
We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.
Feb-9-2026, 01:45:40 GMT
- Country:
  - North America > United States
    - Washington > King County
      - Seattle (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.04)
  - Asia > Middle East
    - Jordan (0.04)
- Genre:
  - Research Report (0.46)
- Technology:
  - Information Technology > Artificial Intelligence
    - Robots (1.00)
    - Representation & Reasoning (1.00)
    - Natural Language (1.00)
    - Machine Learning
      - Reinforcement Learning (0.94)
      - Statistical Learning (0.92)