Provably Efficient Reinforcement Learning via Surprise Bound
