Bootstrapping Statistical Inference for Off-Policy Evaluation

Hao, Botao, Ji, Xiang, Duan, Yaqi, Lu, Hao, Szepesvári, Csaba, Wang, Mengdi

Feb-9-2021–arXiv.org Machine Learning

Bootstrapping provides a flexible and effective approach for assessing the quality of batch reinforcement learning, yet its theoretical property is less understood. In this paper, we study the use of bootstrapping in off-policy evaluation (OPE), and in particular, we focus on the fitted Q-evaluation (FQE) that is known to be minimax-optimal in the tabular and linear-model cases. We propose a bootstrapping FQE method for inferring the distribution of the policy evaluation error and show that this method is asymptotically efficient and distributionally consistent for off-policy statistical inference. To overcome the computation limit of bootstrapping, we further adapt a subsampling procedure that improves the runtime by an order of magnitude. We numerically evaluate the bootrapping method in classical RL environments for confidence interval estimation, estimating the variance of off-policy evaluator, and estimating the correlation between multiple off-policy evaluators.

artificial intelligence, bootstrapping statistical inference, health & medicine, (15 more...)

arXiv.org Machine Learning

Feb-9-2021

arXiv.org PDF

Add feedback

Country:
- North America > Canada > Alberta (0.14)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Health & Medicine (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (0.93)
  - Performance Analysis > Accuracy (1.00)
  - Reinforcement Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found