Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model

Andrea Zanette, Mykel J. Kochenderfer, Emma Brunskill

Aug-19-2025, 22:49:33 GMT–Neural Information Processing Systems

Markov Decision Process (MDP) provided that we can access the reward and transition function through a generative model.

algorithm, neural information processing system, sample complexity

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.47)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language > Generation (0.62)
    - Machine Learning
      - Reinforcement Learning (0.71)
      - Learning Graphical Models > Undirected Networks
        - Markov Models (0.49)
