[D] Better reinforcement learning algorithms than A3C? • r/MachineLearning

@machinelearnbot 

This sounds like an underspecified example. I mean, A3C and DQN/Q-learning aren't even the same kind of method: A3C is on-policy, while DQN/Q-learning is off-policy. A3C has mostly been replaced by PPO, and on-policy SOTA has since moved on from that to IMPALA/Unicorn. I'm not sure what is SOTA for off-policy learning, but Rainbow outperforms DQN and most of the DQN zoo. And progress here may be somewhat illusory, as the methodological papers have been pointing out: a lot of these tasks are not inherently difficult, there's a lot of variance between training runs, and improvements may be due to undocumented tweaks or just somewhat better hyperparameters...
