Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling