To Reviewer #1

Neural Information Processing Systems 

We thank the reviewers for all of these valuable comments. We provide point-by-point responses below and will expand the discussion in the revision.

Q3: "...why the authors did not choose all the tasks used in the COMA paper ..."
A: Actually, all these settings are based on the SMAC framework.

Q4: "...deeper analyses of the learned intrinsic reward..."
A: ... 'attack' when the corresponding HPs are lower than ... We will include these discussions in the revision.