[Reinforcement learning] Can we learn action embeddings as high-level goals? • /r/MachineLearning
I recently read Karpathy's blog post about reinforcement learning and current techniques, which got me thinking about a few ideas. We learn differently from policy gradients, MDPs, and similar methods. That is, we don't evaluate every possible action in each state and pick the most beneficial one. Instead, we have layers of actions, where each layer describes our strategy more abstractly and at a higher level.
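To make the "layers of actions" idea a bit more concrete, here is a rough sketch of one possible reading of it, not an established method. A high-level layer emits a goal embedding and a low-level layer picks the primitive action whose embedding is closest to that goal. Everything in it is an assumption for illustration: the 1-D world, the `PRIMITIVES` table, and the hand-written rules standing in for what would be learned policies.

```python
# Hypothetical two-layer action hierarchy in a 1-D world.
# All names and embeddings below are illustrative assumptions.

# Each primitive action gets a (here hand-picked, ideally learned) embedding.
PRIMITIVES = {
    "left": (-1.0,),
    "right": (1.0,),
    "stay": (0.0,),
}

def high_level_policy(state):
    """Return a goal embedding: the direction toward the target, or zero if already there.

    A fixed rule stands in for a learned high-level policy.
    """
    dx = state["target_x"] - state["x"]
    if dx == 0:
        return (0.0,)
    return (1.0,) if dx > 0 else (-1.0,)

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def low_level_policy(goal_embedding):
    """Pick the primitive whose embedding lies closest to the goal embedding."""
    return min(
        PRIMITIVES,
        key=lambda name: squared_distance(PRIMITIVES[name], goal_embedding),
    )

def act(state):
    """Run both layers: abstract goal first, then the concrete action."""
    goal = high_level_policy(state)
    return low_level_policy(goal)
```

Calling `act({"x": 0, "target_x": 3})` goes through the goal embedding first and only then resolves to a primitive action. The point of the split is that the upper layer never reasons about individual primitives, only about where it wants to end up.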
Jun-2-2016, 13:00:57 GMT