Review for NeurIPS paper: Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
–Neural Information Processing Systems
Additional Feedback: The authors' response has addressed my questions. I will keep my score. This is a natural question to ask, so it could be worth an explanation somewhere. However, this paper suggests a slower rate by a factor of $(1-\gamma)^{-2}$. What could cause the difference and how could the theory here guide development of deep RL algorithms?
Jan-23-2025, 02:04:22 GMT