Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model

Andrea Zanette, Mykel J. Kochenderfer, Emma Brunskill

Feb-13-2026, 10:57:54 GMT–Neural Information Processing Systems

In particular, well-known bounds for online learning scale as a function of the gap between the expected reward of a particular action and that of the optimal action [ABF02], and also of the variance of the rewards [AMS09].
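To make the two quantities in that sentence concrete, here is a minimal sketch that computes per-action gaps and reward variances for a toy bandit. The arm means, the Bernoulli reward model, and all function names are assumptions made for illustration. The two summary terms only show the general shape such problem-dependent quantities tend to take; they are not the bounds from [ABF02], [AMS09], or this paper.

```python
# Illustrative sketch only: a toy 3-armed Bernoulli bandit.
# The success probabilities below are made-up values, not from the paper.
arm_means = [0.9, 0.8, 0.5]


def gaps(means):
    """Gap of each action: optimal expected reward minus its own."""
    best = max(means)
    return [best - m for m in means]


def bernoulli_variances(means):
    """Reward variance of each action, assuming Bernoulli rewards."""
    return [m * (1.0 - m) for m in means]


def gap_complexity(means):
    """Sum of 1/gap over suboptimal actions (a gap-dependent term)."""
    return sum(1.0 / g for g in gaps(means) if g > 0)


def variance_aware_complexity(means):
    """Sum of variance/gap over suboptimal actions (a variance-aware term)."""
    pairs = zip(bernoulli_variances(means), gaps(means))
    return sum(v / g for v, g in pairs if g > 0)


if __name__ == "__main__":
    print("gaps:", gaps(arm_means))
    print("variances:", bernoulli_variances(arm_means))
    print("gap-dependent term:", gap_complexity(arm_means))
    print("variance-aware term:", variance_aware_complexity(arm_means))
```

In this toy instance the variance-aware term is smaller than the purely gap-dependent one because Bernoulli variances are at most 1/4, which is the intuition behind bounds that exploit low reward variance.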



