Reinforcement Learning: A Comparison of UCB Versus Alternative Adaptive Policies