Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
