Reviews: Adaptive Auxiliary Task Weighting for Reinforcement Learning
Neural Information Processing Systems
I think the results are much more comprehensive now, and I have raised my score accordingly. If I understand the main idea correctly, the proposed method can be interpreted as a gradient-based meta-learning method (e.g., MAML): the algorithm computes the gradient of the main objective with respect to the auxiliary task weights by differentiating through the parameter update. It would be good to present this perspective and to review the relevant work on meta-gradients for RL (e.g., MAML [Finn et al.], meta-gradient RL [Xu et al.], learning intrinsic rewards [Zheng et al.]). Nevertheless, I think this is a novel application of meta-gradients to tuning auxiliary task weights.
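To make the meta-gradient reading concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes toy quadratic losses, a single auxiliary task with scalar weight `w`, and one plain gradient step, and every function name is hypothetical. Under those assumptions the gradient of the post-update main loss with respect to `w` reduces to the negative learning rate times the inner product of the main-loss gradient (after the update) and the auxiliary-loss gradient (before the update).

```python
def grad_quadratic(theta, target):
    # Gradient of the toy loss 0.5 * ||theta - target||^2 (an assumed stand-in
    # for the main or auxiliary RL objective).
    return [t - c for t, c in zip(theta, target)]


def main_loss(theta, main_target):
    # Toy main objective: 0.5 * ||theta - main_target||^2.
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, main_target))


def inner_update(theta, w, main_target, aux_target, lr):
    # One gradient step on main loss + w * auxiliary loss.
    g_main = grad_quadratic(theta, main_target)
    g_aux = grad_quadratic(theta, aux_target)
    return [t - lr * (gm + w * ga) for t, gm, ga in zip(theta, g_main, g_aux)]


def meta_gradient(theta, w, main_target, aux_target, lr):
    # d main_loss(theta_new) / d w via the chain rule through the update:
    # d theta_new / d w = -lr * grad_aux(theta).
    theta_new = inner_update(theta, w, main_target, aux_target, lr)
    g_main_new = grad_quadratic(theta_new, main_target)
    g_aux = grad_quadratic(theta, aux_target)
    return -lr * sum(a * b for a, b in zip(g_main_new, g_aux))


if __name__ == "__main__":
    theta, lr, w = [0.0, 0.0], 0.1, 0.5
    main_target = [1.0, 0.0]
    # Auxiliary target roughly aligned with the main one: negative
    # meta-gradient, so raising w lowers the main loss after the update.
    print(meta_gradient(theta, w, main_target, [1.0, 1.0], lr))
    # Auxiliary target opposed to the main one: positive meta-gradient,
    # so the weight should be decreased.
    print(meta_gradient(theta, w, main_target, [-1.0, 0.0], lr))
```

The sign of the result is what a weight update would act on: a gradient-descent step on `w` increases the weight of an auxiliary task whose gradient agrees with the main task and decreases it otherwise.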
Jan-21-2025, 13:48:05 GMT