AdaptiveDiscretizationforModel-Based ReinforcementLearning
–Neural Information Processing Systems
Ouralgorithm isbasedonoptimistic one-stepvalueiteration extended to maintain an adaptive discretization of the space. From atheoretical perspective we provide worst-case regret bounds for our algorithm which are competitivecompared tothestate-of-the-art model-based algorithms.
Neural Information Processing Systems
Feb-7-2026, 21:23:16 GMT