A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model