Review for NeurIPS paper: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Jan-21-2025, 13:04:00 GMT–Neural Information Processing Systems

Additional Feedback:

Post-rebuttal: The authors addressed some of my concerns. Since the authors will redesign some of the experiments in the revision, I will raise my score to 6.

Comments and questions:

1. Are there any lower bound results on the sample complexity of planning? Are there any particular reasons, and what is the high-level idea of this algorithm? If I understand correctly, this rule is meant to obtain the gap-dependent sample complexity. What if we use the simple greedy policy for the first action, and what would go wrong in the proof?

algorithm, gap-dependent sample complexity, sample complexity


