
Neural Information Processing Systems 

We thank the reviewers for the helpful feedback and the positive assessment of our submission.

Reviewer #1: "It is interesting to see if further increase the width of the network (from linear in d to polynomial in d [...]" — In the setting of our paper (minimization of the total network size), a large depth is in some sense unavoidable (as e.g. [...]). In general, however, there is of course some trade-off between width and depth.

Reviewer #4: "Theorem 5.1 extends the approximation results to all piece-wise linear activation functions [...] So in theory, this should also apply to max-outs and other variants of ReLUs such as Leaky ReLUs?" — That's right: all these functions are easily expressible one via another using just linear operations.

Reviewer #4: "I fail to see some intuitions regarding the typical values of r, d, and H for the networks used in practice." [response truncated]

T. Poggio et al., Why and when can deep- but not shallow-networks avoid the curse of dimensionality: A review.
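The claim that these activations are expressible one via another using only linear operations can be made concrete. The following sketch is our illustration, not text from the rebuttal; it spells out the standard identities LeakyReLU_a(x) = ReLU(x) - a*ReLU(-x) and max(a, b) = a + ReLU(b - a), which let a ReLU network emulate leaky-ReLU or maxout units (and vice versa) at the cost of a constant factor in size.

```python
# Illustrative identities (assumed standard, not taken from the rebuttal):
# common piecewise linear activations reduce to ReLU via linear operations.

def relu(x: float) -> float:
    return max(x, 0.0)

def leaky_relu_via_relu(x: float, alpha: float = 0.01) -> float:
    # LeakyReLU_alpha(x) = ReLU(x) - alpha * ReLU(-x)
    return relu(x) - alpha * relu(-x)

def maxout_via_relu(a: float, b: float) -> float:
    # A two-input maxout unit: max(a, b) = a + ReLU(b - a)
    return a + relu(b - a)

if __name__ == "__main__":
    # Spot-check the identities against the direct definitions.
    for x in [-2.0, -0.5, 0.0, 1.5]:
        direct = x if x > 0 else 0.01 * x
        assert abs(leaky_relu_via_relu(x) - direct) < 1e-12
    assert maxout_via_relu(3.0, 5.0) == 5.0
    assert maxout_via_relu(5.0, 3.0) == 5.0
```

Each identity uses only the ReLU nonlinearity plus affine combinations of inputs and outputs, which is exactly the operation set available inside a feedforward network layer, so approximation rates transfer between these activation families up to constant factors.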
