Deep Reinforcement Learning Doesn't Work Yet
NAS isn't exactly tuning hyperparameters, but I think it's reasonable that neural net design decisions would act similarly. This is good news for learning, because the correlations between decision and performance are strong. Finally, not only is the reward rich, it's actually what we care about when we train models. The combination of all these points helps me understand why it "only" takes about 12,800 trained networks to learn a better one, compared to the millions of examples needed in other environments. Several parts of the problem all push in RL's favor.
Feb-15-2018, 05:07:09 GMT