Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems 

This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration.
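For readers unfamiliar with the term, the sub-optimality gap of a state-action pair is usually defined as shown below. This is the standard textbook form, given here as an assumption; the paper's own definition (for example, one indexed by time step or depth) may differ. The symbols $V^\star$, $Q^\star$, and $\Delta$ are illustrative notation and are not taken from the text above.

```latex
\[
  \Delta(s, a) \;=\; V^\star(s) - Q^\star(s, a),
  \qquad
  V^\star(s) \;=\; \max_{a'} Q^\star(s, a').
\]
```

Under this definition, $\Delta(s, a) = 0$ for an optimal action, and a larger gap means the action is easier to rule out, which is why a gap-dependent bound can be smaller than a worst-case one.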
