Minimax Off-Policy Evaluation for Multi-Armed Bandits
