Active Model Estimation in Markov Decision Processes