Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

Aug-15-2025, 18:42:17 GMT–Neural Information Processing Systems

Our regret bound improves upon the results of [Jin et al., 2018] and

algorithm, UCB-Advantage, update rule

Country:
- North America
  - United States > Illinois (0.04)
  - Canada (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - China (0.04)

Genre:
- Research Report (0.68)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
