Supplementary Materials for "POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis"

A Algorithm Details

Neural Information Processing Systems 

Finally, define $X_\varepsilon \triangleq \{x \in X : f(x) \ge f^* - \varepsilon\}$ to be the set of arms that are $\varepsilon$-close to optimal. Note that with the depth limitation $H$ it is possible that the nodes on depth $H$ might be played more than once at different rounds. Let $T_1$ be the set of nodes above depth $H$ that are descendants of nodes in $I_H$. In the following, we analyze each of the four parts individually. To proceed further, we first need to state several definitions that are useful throughout.
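As a rough illustration of the set of ε-close-to-optimal arms, the sketch below computes it over a finite grid of arms. The function name `eps_optimal_arms`, the grid, and the quadratic reward are all hypothetical choices made for this example, and the optimum f* is approximated by the maximum over the grid rather than the supremum over the full arm space.

```python
def eps_optimal_arms(arms, f, eps):
    """Return the arms x with f(x) >= f* - eps.

    Assumption: f* is taken as the maximum of f over the finite list `arms`,
    which only approximates the supremum over a continuous arm space.
    """
    values = [f(x) for x in arms]
    f_star = max(values)
    return [x for x, v in zip(arms, values) if v >= f_star - eps]


# Hypothetical reward f(x) = 1 - (x - 0.5)^2 on an 11-point grid over [0, 1].
grid = [i / 10 for i in range(11)]
near_optimal = eps_optimal_arms(grid, lambda x: 1 - (x - 0.5) ** 2, eps=0.02)
print(near_optimal)
```

Shrinking `eps` narrows the returned set toward the maximizer, which is the sense in which the set collects the arms that are ε-close to optimal.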
