Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees

Oct-7-2025–arXiv.org Machine Learning

This article introduces the theory of offline reinforcement learning in large state spaces, where good policies are learned from historical data without online interactions with the environment. Key concepts introduced include expressivity assumptions on function approximation (e.g., Bellman completeness vs. realizability) and data coverage (e.g., all-policy vs. single-policy coverage). A rich landscape of algorithms and results is described, depending on the assumptions one is willing to make and the sample and computational complexity guarantees one wishes to achieve. We also discuss open questions and connections to adjacent areas.
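
For orientation, the expressivity and coverage notions named above are commonly formalized along the following lines. This is only a sketch in standard textbook-style notation, which is assumed here and not taken from the article itself: a discounted MDP with reward R, transition kernel P and discount factor \gamma, a value-function class \mathcal{F}, a data distribution \mu, and the discounted state-action occupancy d^{\pi} of a policy \pi. The article's own definitions may differ in detail.

```latex
% Assumed notation, not from the article: R, P, \gamma, \mathcal{F}, \mu, d^{\pi}.
% Requires amsmath (align*) and amssymb (\mathbb).
\begin{align*}
(\mathcal{T} f)(s,a) &= R(s,a) + \gamma\, \mathbb{E}_{s' \sim P(\cdot \mid s,a)}
    \Big[ \max_{a'} f(s',a') \Big]
  && \text{(Bellman optimality operator)} \\
Q^\star &\in \mathcal{F}
  && \text{(realizability)} \\
\mathcal{T} f &\in \mathcal{F} \quad \text{for all } f \in \mathcal{F}
  && \text{(Bellman completeness)} \\
\sup_{\pi}\, \sup_{s,a}\, \frac{d^{\pi}(s,a)}{\mu(s,a)} &\le C_{\mathrm{all}}
  && \text{(all-policy coverage)} \\
\sup_{s,a}\, \frac{d^{\pi^\star}(s,a)}{\mu(s,a)} &\le C_{\pi^\star}
  && \text{(single-policy coverage)}
\end{align*}
```

Under these assumed definitions, single-policy coverage is the weaker data requirement, since the smallest valid C_{\pi^\star} never exceeds the smallest valid C_{\mathrm{all}}. Realizability is preserved when \mathcal{F} is enlarged, whereas Bellman completeness can be lost by adding functions to \mathcal{F}.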
