400e5e6a7ce0c754f281525fae75a873-Supplemental.pdf

Neural Information Processing Systems 

We now provide some basic results about pairs of ring and false-ring MDPs that we will use periodically in our proofs. Lemma 1. We first consider the case when n<, noting that both MDPs are deterministic and that for any state s, performing n transitions will always return to s. We now assume that V contains at least one constant function and Π is non-empty, and produce an instance of an environment and model class where the relation is strict. Let the environment be a K-state ring environment (see A.1.1):mK Corollary 1. Let det be the set of all deterministic policies.
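The return-to-start property mentioned for the deterministic case can be illustrated with a minimal sketch. It assumes a ring in which every action moves the agent from state i to state (i + 1) mod K; this transition rule and the names `ring_step` and `rollout` are hypothetical, chosen for illustration, and are not taken from the definition in A.1.1.

```python
def ring_step(state: int, num_states: int) -> int:
    """Assumed deterministic transition: advance one position on the ring."""
    return (state + 1) % num_states


def rollout(state: int, num_states: int, num_steps: int) -> int:
    """Apply the assumed ring transition num_steps times and return the final state."""
    for _ in range(num_steps):
        state = ring_step(state, num_states)
    return state


K = 5  # illustrative ring size, not a value from the source

# Under the assumed dynamics, K transitions from any state lead back to that state.
returns_to_start = all(rollout(s, K, K) == s for s in range(K))
print(returns_to_start)
```

Under this assumed rule, any whole multiple of K steps also returns to the starting state, while fewer than K steps never does.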
