Near-Optimal Reinforcement Learning in Dynamic Treatment Regimes