Neural Information Processing Systems 

We respond to the concerns point by point below. Why does distilling prioritized paths improve architecture rating? Subnets that are trained more fully yield a more accurate architecture rating [6] (Sec. 4.3). Which set is used to train the matching network? We will revise the manuscript to make this point clearer.