Tightness of bounds in Theorem 3.1 (response to all reviewers)

For the class of networks mentioned, the last inequality becomes trivial since the loss is the MSE. For ReLU networks, weights can be found such that the other inequalities are tight. We discuss the use of post-activations in Section C.2. As the figure shows, the method is successful; thus, the theorem applies to a successful method. On the reviewer's other point, we also note that ReLU is Lipschitz continuous with constant 1 (see the short derivation below); thus, we believe the algorithm can be widely applied.
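To make the Lipschitz claim concrete, here is a short, standard derivation (not specific to our setting; $x, y$ denote arbitrary real scalars):

$$
|\mathrm{ReLU}(x) - \mathrm{ReLU}(y)| = |\max(x, 0) - \max(y, 0)| \le |x - y| \quad \text{for all } x, y \in \mathbb{R}.
$$

Checking cases: if $x, y \ge 0$, the two sides are equal; if $x, y < 0$, the left side is $0$; and if, say, $x \ge 0 > y$, then $|\max(x, 0) - \max(y, 0)| = x \le x - y = |x - y|$. Equality in the first case shows the constant $1$ cannot be improved.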