400e5e6a7ce0c754f281525fae75a873-Supplemental.pdf
Neural Information Processing Systems
We now provide some basic results about pairs of ring and false-ring MDPs that we will use periodically in our proofs. Lemma 1. We first consider the case when n<, noting that both MDPs are deterministic and that for any state s, performing n transitions will always return to s. We now assume that V contains at least one constant function and Π is non-empty, and produce an instance of an environment and model class where the relation is strict. Let the environment be a K-state ring environment (see A.1.1): mK Corollary 1. Let det be the set of all deterministic policies.
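The return-to-start property mentioned above can be illustrated with a minimal sketch. The ring definition from A.1.1 is not reproduced in this excerpt, so the dynamics below are an assumption: each state moves deterministically to its successor modulo the ring size. The names `ring_step` and `rollout` are hypothetical helpers, not taken from the paper.

```python
def ring_step(state: int, num_states: int) -> int:
    """Assumed ring dynamics: move deterministically to the next state, wrapping around."""
    return (state + 1) % num_states


def rollout(state: int, num_states: int, steps: int) -> int:
    """Apply the deterministic transition `steps` times and return the final state."""
    for _ in range(steps):
        state = ring_step(state, num_states)
    return state


K = 5  # illustrative ring size, chosen arbitrarily

# After exactly K transitions, every start state should map back to itself.
returns = [rollout(s, K, K) for s in range(K)]
```

Under these assumed dynamics, `returns` equals the list of start states, which is the sense in which performing a full cycle of transitions always returns to s.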
Feb-8-2026, 08:55:46 GMT