Differentiating Policies for Non-Myopic Bayesian Optimization