Gaussian Process Bandits for Tree Search: Theory and Application to Planning in Discounted MDPs