Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou

Nov-20-2025, 20:37:37 GMT–Neural Information Processing Systems

In this paper, we propose a new off-policy estimator that applies importance sampling (IS) directly to the stationary state-visitation distributions, avoiding the exploding variance faced by existing methods.
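As a rough illustration of weighting by stationary state-visitation distributions, the sketch below evaluates a target policy on a toy two-state, two-action problem in the average-reward setting. It makes a strong simplifying assumption that is not from the paper: the transition model is known, so the state density ratio is computed exactly rather than estimated from data. The toy dynamics, rewards, policies, and all function names are hypothetical.

```python
import numpy as np


def state_transition(P, policy):
    """Collapse P[s, a, s'] and policy[s, a] into the state chain P_pi[s, s']."""
    return np.einsum("sa,sat->st", policy, P)


def stationary_distribution(P_pi):
    """Stationary distribution of a row-stochastic matrix (left eigenvector for eigenvalue 1)."""
    vals, vecs = np.linalg.eig(P_pi.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()


def simulate(P, R, policy, n_steps, rng):
    """Roll out one long trajectory under `policy`; return visited states, actions, rewards."""
    n_states, n_actions = R.shape
    states = np.empty(n_steps, dtype=int)
    actions = np.empty(n_steps, dtype=int)
    rewards = np.empty(n_steps)
    s = 0
    for t in range(n_steps):
        a = rng.choice(n_actions, p=policy[s])
        states[t], actions[t], rewards[t] = s, a, R[s, a]
        s = rng.choice(n_states, p=P[s, a])
    return states, actions, rewards


def stationary_is_estimate(states, actions, rewards, w, pi, pi0):
    """Self-normalized estimate: each sample is weighted by w(s) * pi(a|s) / pi0(a|s)."""
    weights = w[states] * pi[states, actions] / pi0[states, actions]
    return np.sum(weights * rewards) / np.sum(weights)


# Hypothetical toy problem (assumed for illustration only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.1, 0.9]]])   # P[s, a, s']
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])                 # R[s, a]
pi0 = np.array([[0.5, 0.5], [0.5, 0.5]])   # behavior policy
pi = np.array([[0.8, 0.2], [0.3, 0.7]])    # target policy

d_pi0 = stationary_distribution(state_transition(P, pi0))
d_pi = stationary_distribution(state_transition(P, pi))
w = d_pi / d_pi0  # oracle state density ratio (known-model assumption)

rng = np.random.default_rng(0)
states, actions, rewards = simulate(P, R, pi0, 20000, rng)
estimate = stationary_is_estimate(states, actions, rewards, w, pi, pi0)

# Exact average reward of the target policy, for comparison.
true_value = float(np.sum(d_pi[:, None] * pi * R))
print(f"estimate={estimate:.3f}  true={true_value:.3f}")
```

Each sample here carries a single weight that does not depend on its position in the trajectory, whereas trajectory-wise importance sampling multiplies one policy ratio per time step, so its weights compound as the horizon grows.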

machine learning, reinforcement learning, trajectory

Country:
- North America
  - United States
    - Texas > Travis County
      - Austin (0.14)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Industry:
- Transportation (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Reinforcement Learning (0.94)
