Control Variates for Slate Off-Policy Evaluation: Supplementary Text Nikos Vlassis Netflix Ashok Chandrashekar WarnerMedia Fernando Amat Gil Netflix Nathan Kallus Cornell University and Netflix

Neural Information Processing Systems 

To train the lasso models we used sklearn.linear_model.Lasso (with parameters fit_intercept = False, max_iter = 500, tol = 1e-4, normalize = False, precompute = False, copy_X = False, The author was with Netflix when this work was concluded. Results are qualitatively very similar to what we obtained with tree-based models.