Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models

Neural Information Processing Systems 

We study the problem of off-policy evaluation (OPE) for episodic Partially Observable Markov Decision Processes (POMDPs) with continuous states.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found