Neural Information Processing Systems 

We thank all reviewers for their thoughtful feedback, which helped us sharpen the presentation of our results.

[...]'s questions on bounds, we will present them more explicitly in the paper, as briefly described here. We refer R1 to Corollary 2.1 [...]

Combining this upper bound with the lower bound above (right term in the max), Th2 is also tight w.r.t. [...]

[...]'s questions: our contribution focuses solely on expressiveness aspects which draw the boundaries [...]

Note that the experiments in Fig. 1 [...]

We are glad for R2's implementation, but since we do not know the experiment details it is hard to [...]

Indeed, Kaplan et al. employ hyper-parameter tuning (LR, initialization, batch size, etc.) as [...]