Reviews: Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning
Neural Information Processing Systems
The introduction of gossip algorithms to deep RL is original. The work is generally clearly presented, but some of the reported baseline results do not match previously published work. Figure 1: The IMPALA results look completely off, as do the A3C results on Pong and the A3C results in the appendix. There shouldn't be such a discrepancy between A3C and IMPALA when both are run with the same hyperparameters (there is a larger discrepancy on some games, but not on these). I suspect a bug in the IMPALA implementation, or at least an unfair comparison, since all other results use a more recent (and hence better tuned) set of hyperparameters from [Stooke & Abbeel 2018].
Jan-26-2025, 04:04:29 GMT