Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation

Hanna, Josiah P. (The University of Texas at Austin) | Stone, Peter (The University of Texas at Austin) | Niekum, Scott (The University of Texas at Austin)

Feb-14-2017–AAAI Conferences

In many reinforcement learning applications, it is desirable to determine confidence interval lower bounds on the performance of any given policy without executing said policy. In this context, we propose two bootstrapping off-policy evaluation methods that use learned MDP transition models to estimate lower confidence bounds on policy performance with limited data. We empirically evaluate the proposed methods on standard policy evaluation tasks.
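The general idea of pairing a bootstrap with a learned transition model can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a small tabular MDP, trajectories stored as lists of (state, action, reward, next_state) tuples, a count-based model, and a plain percentile bootstrap. The names model_based_estimate and bootstrap_lower_bound, and all parameters, are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def model_based_estimate(trajectories, policy, n_states, n_actions,
                         gamma=0.95, horizon=20, start_state=0):
    """Fit a count-based tabular model, then evaluate `policy` inside it.

    Assumption: each trajectory is a list of (s, a, r, s_next) tuples and
    policy[s][a] is the evaluation policy's probability of action a in s.
    """
    next_counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    visits = defaultdict(int)
    for traj in trajectories:
        for s, a, r, s_next in traj:
            next_counts[(s, a)][s_next] += 1
            reward_sum[(s, a)] += r
            visits[(s, a)] += 1

    # Finite-horizon policy evaluation in the learned model.
    v = [0.0] * n_states
    for _ in range(horizon):
        new_v = [0.0] * n_states
        for s in range(n_states):
            total = 0.0
            for a in range(n_actions):
                n = visits[(s, a)]
                if policy[s][a] == 0.0 or n == 0:
                    # Assumption: unseen state-action pairs contribute zero.
                    continue
                r_hat = reward_sum[(s, a)] / n
                exp_next = sum(c / n * v[s2]
                               for s2, c in next_counts[(s, a)].items())
                total += policy[s][a] * (r_hat + gamma * exp_next)
            new_v[s] = total
        v = new_v
    return v[start_state]


def bootstrap_lower_bound(trajectories, estimator, delta=0.05,
                          n_boot=200, seed=0):
    """Percentile-bootstrap lower bound at confidence level 1 - delta."""
    rng = random.Random(seed)
    n = len(trajectories)
    estimates = []
    for _ in range(n_boot):
        # Resample whole trajectories with replacement.
        resample = [trajectories[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    estimates.sort()
    idx = min(max(int(delta * n_boot), 0), n_boot - 1)
    return estimates[idx]
```

To use it, wrap the estimator so it takes only the resampled data, for example estimator = lambda d: model_based_estimate(d, policy, n_states, n_actions), and pass that to bootstrap_lower_bound. Because the model is refit on every resample, the spread of the resulting estimates reflects uncertainty from the limited data; any bias of the model itself is not corrected by this sketch.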

artificial intelligence, confidence interval, machine learning

Country:
- North America > United States > Texas > Travis County > Austin (0.15)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.77)
