Adaptive Discretization for Model-Based Reinforcement Learning

Neural Information Processing Systems 

Our algorithm is based on optimistic one-step value iteration, extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm that are competitive with state-of-the-art model-based algorithms.
