Near-optimal Policy Identification in Active Reinforcement Learning