DeepMind just published a mind-blowing paper: PathNet.
Each of those nine boxes is the PathNet at a different iteration. In this case, PathNet was trained on two different games using the Asynchronous Advantage Actor-Critic (A3C) algorithm. Although Pong and Alien seem very different at first, we observe positive transfer between them when using PathNet (take a look at the score graph).

First of all, we need to define the modules. Let L be the number of layers and N be the maximum number of modules per layer (the paper indicates that N is typically 3 or 4).
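To make the roles of L and N concrete, here is a minimal sketch in Python. It assumes that N caps how many modules a single pathway may use in each layer, and that each layer holds a pool of M candidate modules. The pool size M, the function names, and the uniform random initialisation are illustrative assumptions, not details taken from the text above.

```python
import random

def random_pathway(num_layers, max_active, modules_per_layer, rng=None):
    """Return an L x N matrix of module indices, one row per layer.

    Assumption: each layer has `modules_per_layer` (M) candidate modules,
    and a pathway picks up to `max_active` (N) of them per layer.
    """
    rng = rng or random.Random()
    return [
        [rng.randrange(modules_per_layer) for _ in range(max_active)]
        for _ in range(num_layers)
    ]

def active_modules(pathway):
    """Collapse duplicate picks: the distinct modules used in each layer."""
    return [sorted(set(layer)) for layer in pathway]

# Hypothetical sizes: L = 3 layers, N = 4 picks per layer, M = 10 modules.
L, N, M = 3, 4, 10
pathway = random_pathway(L, N, M, rng=random.Random(0))
for depth, modules in enumerate(active_modules(pathway)):
    print(f"layer {depth}: active modules {modules}")
```

Because the same index can be drawn twice in one row, a layer may end up with fewer than N distinct active modules, which is why N acts as a maximum rather than an exact count.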
Feb-21-2017, 05:40:44 GMT