Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
