Review for NeurIPS paper: A Self-Tuning Actor-Critic Algorithm
Neural Information Processing Systems
This paper shows how to automatically optimize the hyper-parameters of an RL algorithm (here, IMPALA) by gradient descent while the agent is learning. Initial reviews were mixed: all reviewers saw it as a borderline paper, trending towards rejection. However, after the reviewers took the author feedback into account and discussed the pros and cons of the submission, a consensus towards acceptance emerged. Everyone (myself included) agrees that, although this work is mostly incremental, it convincingly demonstrates that hyper-parameter optimization is possible on a wide range of RL tasks. This is a meaningful contribution, given how challenging (and computationally intensive) it can be to tune the hyper-parameters of RL algorithms.
Feb-8-2025, 00:30:02 GMT