Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling
