Reviews: Optimistic posterior sampling for reinforcement learning: worst-case regret bounds