Fast active learning for pure exploration in reinforcement learning