Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions Haanvid Lee
–Neural Information Processing Systems
We consider local kernel metric learning for off-policy evaluation (OPE) of deterministic policies in contextual bandits with continuous action spaces.
Neural Information Processing Systems
Oct-2-2025, 16:28:00 GMT
- Country:
- Asia > India (0.04)
- North America
- Canada > Quebec
- Montreal (0.04)
- United States
- Massachusetts > Hampshire County
- Amherst (0.04)
- Wisconsin > Dane County
- Madison (0.04)
- Massachusetts > Hampshire County
- Canada > Quebec
- Industry:
- Technology: