We have significantly updated the results of the paper to show that our meta-gradient approach scales to complex reinforcement-learning domains.

Neural Information Processing Systems 

Figure 2 shows the performance improvement on 5 Atari games for an agent trained with both actor-critic and (continuously adapting) meta-gradient-learned auxiliary question losses, compared to three baseline agents after 200M frames of training; the curiosity-based RL baseline uses hand-designed intrinsic rewards. We also added more complex Atari tasks that test representation learning driven only by the meta-learned auxiliary questions, compared against hand-crafted baselines. These new results show that the approach scales to complex RL domains.
