Bayesian Optimization for Policy Search via Online-Offline Experimentation