[D] Better reinforcement learning algorithms than A3C? • r/MachineLearning

May-26-2018, 02:15:26 GMT – @machinelearnbot

This sounds like an underspecified example. A3C and DQN/Q-learning aren't even the same kind of method in terms of on-policy versus off-policy learning: A3C is on-policy, while DQN and Q-learning are off-policy. A3C has mostly been replaced by PPO, and on-policy SOTA has since moved on from that to IMPALA/UNICORN. I'm not sure what is SOTA for off-policy learning, but Rainbow outperforms DQN and most of the DQN zoo. And progress here may be somewhat illusory, as the methodological papers have been pointing out: a lot of these tasks are not inherently difficult, there's so much variance across training runs, and reported improvements may be due to undocumented tweaks or just somewhat better hyperparameters...
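For anyone unsure what the on-policy/off-policy split means in practice, here is a minimal tabular sketch. It is not A3C or DQN themselves (A3C is an actor-critic policy-gradient method and DQN uses a neural network with replay); it uses SARSA as a stand-in for the on-policy side and plain Q-learning for the off-policy side, since the difference shows up directly in the bootstrap target. The function names and the toy Q-table are made up for illustration.

```python
# Toy illustration of on-policy vs off-policy bootstrap targets.
# All names and numbers here are hypothetical, chosen only for the example.

def q_learning_target(Q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the greedy action in next_state,
    regardless of which action the behaviour policy actually takes."""
    return reward + gamma * max(Q[next_state])

def sarsa_target(Q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action the current policy
    actually selected in next_state (including exploratory actions)."""
    return reward + gamma * Q[next_state][next_action]

# Q[state][action] for a toy problem with 2 states and 2 actions.
Q = [[0.0, 0.0],
     [1.0, 5.0]]

reward, next_state, gamma = 1.0, 1, 0.9
next_action = 0  # suppose exploration picked the non-greedy action

off_policy = q_learning_target(Q, reward, next_state, gamma)
on_policy = sarsa_target(Q, reward, next_state, next_action, gamma)

print("Q-learning target:", off_policy)
print("SARSA target:", on_policy)
```

The two targets differ whenever the action actually taken is not the greedy one. That is why off-policy methods can learn from replayed or otherwise stale data while on-policy ones generally need fresh samples from the current policy, and why comparing the two families head-to-head needs some care.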

artificial intelligence, machine learning, reinforcement learning
