Reviews: Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

Jan-26-2025, 04:04:29 GMT–Neural Information Processing Systems

The introduction of gossip algorithms to deep RL is original. The work is generally clearly presented, but some of the reported baseline results do not match previously published work. Figure 1: the IMPALA results look completely off, as do the A3C results on Pong and the A3C results in the appendix. There should not be such a discrepancy between A3C and IMPALA when both are run with the same hyperparameters (a larger discrepancy does exist on some games, but not on these). I suspect a bug in the IMPALA implementation, or at least an unfair comparison, since all other results use a more recent (and hence better tuned) set of hyperparameters from [Stooke & Abbeel 2018].

deep reinforcement learning, gossip-based actor-learner architecture, hyperparameter, (11 more...)

