18fee39e2666f43cf44425138bae9def-Paper-Conference.pdf

Neural Information Processing Systems 

Although importance sampling (IS) provides a basic principle for OPE, it is ill-posed for the deterministic target policy with continuous actions.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found