Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

Tengyang Xie, Yifei Ma, Yu-Xiang Wang

Feb-12-2026, 03:43:58 GMT–Neural Information Processing Systems

Off-policy evaluation (OPE) is often the starting point in many reinforcement learning (RL) applications. To tackle OPE, importance sampling (IS) corrects the mismatch between the distributions induced by the behavior policy and the target policy.
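As a rough illustration of the correction described above, the following sketch implements ordinary trajectory-wise importance sampling, not the marginalized estimator named in the title, whose details are not given in this summary. The function name `is_estimate`, the tabular policy arrays, and the toy data are all assumptions made for the example.

```python
import numpy as np

def is_estimate(trajectories, pi_target, pi_behavior, gamma=1.0):
    """Ordinary importance sampling estimate of the target policy's value.

    trajectories: list of trajectories, each a list of (state, action, reward)
                  tuples collected under the behavior policy.
    pi_target, pi_behavior: arrays of shape [n_states, n_actions] holding
                  action probabilities (hypothetical tabular policies).
    """
    estimates = []
    for traj in trajectories:
        weight = 1.0  # cumulative importance ratio along the trajectory
        ret = 0.0     # discounted return of the trajectory
        for t, (s, a, r) in enumerate(traj):
            weight *= pi_target[s, a] / pi_behavior[s, a]
            ret += (gamma ** t) * r
        # Reweight the observed return so that, in expectation, it matches
        # the return the target policy would have obtained.
        estimates.append(weight * ret)
    return float(np.mean(estimates))

# Toy data (assumed): one state, two actions, one-step episodes.
pi_behavior = np.array([[0.5, 0.5]])  # behavior policy picks uniformly
pi_target = np.array([[1.0, 0.0]])    # target policy always picks action 0
trajectories = [[(0, 0, 1.0)], [(0, 1, 0.0)]]
value = is_estimate(trajectories, pi_target, pi_behavior)
```

In this toy case action 0 always pays 1, so the target policy's true value is 1. The two logged episodes receive weights 2 and 0, and the estimate averages to 1.0. Because the weight is a product of per-step ratios, its variance can grow quickly with the horizon.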

artificial intelligence, machine learning, reinforcement learning

Country:
- North America
  - Canada (0.04)
  - United States
    - Massachusetts > Hampshire County
      - Amherst (0.04)
    - Illinois > Champaign County
      - Urbana (0.04)
    - California
      - Santa Clara County > Palo Alto (0.04)
      - Santa Barbara County > Santa Barbara (0.04)
      - San Mateo County > East Palo Alto (0.04)
- Europe > Sweden
  - Stockholm > Stockholm (0.04)

Industry:
- Health & Medicine (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (0.93)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)
