Review for NeurIPS paper: A Self-Tuning Actor-Critic Algorithm

Feb-8-2025, 00:30:02 GMT–Neural Information Processing Systems

This paper shows how to automatically optimize the hyper-parameters of RL algorithms (specifically IMPALA here) by gradient descent while the agent is learning. Initial reviews were mixed: all reviewers saw it as a borderline paper, trending towards rejection. However, after taking the author feedback into account and discussing the pros and cons of the submission, a consensus towards acceptance emerged. Everyone (myself included) agrees that, although this work is mostly incremental, it convincingly demonstrates that such hyper-parameter optimization is possible on a wide range of RL tasks. This is a meaningful contribution, given how challenging or computationally intensive it can be to tune the hyper-parameters of RL algorithms.

neurips paper, rl algorithm, self-tuning actor-critic algorithm

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)