Review for NeurIPS paper: Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

Jan-27-2025, 14:16:47 GMT–Neural Information Processing Systems

The paper presents a model-free algorithm with an improved regret bound for finite-state, finite-horizon MDPs. The new bound closes the gap with the best model-based result, which is a nice theoretical contribution.

artificial intelligence, machine learning, model-free reinforcement learning via reference-advantage decomposition

