Review for NeurIPS paper: Munchausen Reinforcement Learning
–Neural Information Processing Systems
Additional Feedback: After Authors' Reponse: I still find the paper's analysis regarding action-gaps a bit weak, and the authors' response didn't help much in that regard. I think their action-gap analysis needs to be considered under the new findings of (van Seijen et al., 2019); increasing the action-gap is not important on its own, rather it's the homogeneity of the action-gaps across the states that is important. While I still stand by my verdict of accepting this paper, in light of other reviews, I think the paper's writing should be toned down a bit in regards to its theoretical novelty and claims about empirical results (e.g. the first non-dist-RL to beat a dist-RL). Q1: To the best of my knowledge, IQN in Dopamine also uses Double Q-learning. Is this also the case for your M-IQN agent?
Neural Information Processing Systems
Jan-23-2025, 00:35:22 GMT
- Technology: