Logarithmic Smoothing for Adaptive PAC-Bayesian Off-Policy Learning
