Offline reinforcement learning with uncertainty for treatment strategies in sepsis