Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou
Neural Information Processing Systems
artificial intelligence, international conference on machine learning, machine learning
Feb-14-2026, 21:54:43 GMT
- Country:
  - Europe
    - Poland (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States
      - California > Santa Clara County
        - Palo Alto (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.04)
      - New York (0.04)
      - Texas > Travis County
        - Austin (0.05)
      - Washington > King County
        - Kirkland (0.05)
- Technology: