Empirical Likelihood for Contextual Bandits

Karampatziakis, Nikos, Langford, John, Mineiro, Paul

Jun-7-2019–arXiv.org Machine Learning

We apply empirical likelihood techniques to contextual bandit policy value estimation, confidence intervals, and learning. We propose a tighter estimator for off-policy evaluation with improved statistical performance over previous proposals. Coupled with this estimator is a confidence interval which also improves over previous proposals. We then harness these to improve learning from contextual bandit data. Each of these is empirically evaluated to show good performance against strong baselines in finite sample regimes.

artificial intelligence, confidence interval, machine learning


Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
