Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
