Numerical influence of ReLU'(0) on backpropagation
Supplementary Material

Neural Information Processing Systems 

This is the appendix for "Numerical influence of ReLU'(0) on backpropagation". In Section A.1, we provide some elements of proof for Theorems 1 and 2. In Section A.2, we explain how to check the assumptions of Definition 1 by describing the special case of fully connected ReLU networks.

A.1 Elements of proof of Theorems 1 and 2

The proof arguments were described in [7, 8]. We concentrate on justifying how the results of these works apply to Definition 1, and we point to the relevant results leading to Theorems 1 and 2. It can be inferred from Definition 1 that all elements in the definition of a ReLU network training problem are piecewise smooth, where each piece is an elementary log-exp function. We refer the reader to [30] for an introduction to piecewise smoothness, and to [8] for recent use of such notions in the context of algorithmic differentiation. Let us first argue that the results of [8] apply to Definition 1.
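To make the quantity under study concrete, the following minimal JAX sketch (illustrative only, not the authors' experimental code; the helper make_relu and the parameter s are hypothetical names) defines ReLU variants that assign a chosen value s to ReLU'(0) via a custom derivative rule, and checks that backpropagation propagates that value at the nondifferentiability:

    import jax
    import jax.numpy as jnp

    def make_relu(s):
        # ReLU whose derivative at 0 is set to the chosen value s.
        @jax.custom_jvp
        def relu(x):
            return jnp.maximum(x, 0.0)

        @relu.defjvp
        def relu_jvp(primals, tangents):
            (x,), (t,) = primals, tangents
            # Derivative: 1 for x > 0, 0 for x < 0, and s at the tie x == 0.
            d = jnp.where(x > 0, 1.0, jnp.where(x < 0, 0.0, s))
            return jnp.maximum(x, 0.0), d * t

        return relu

    relu0, relu1 = make_relu(0.0), make_relu(1.0)
    print(jax.grad(relu0)(0.0))  # 0.0
    print(jax.grad(relu1)(0.0))  # 1.0

Away from 0 all variants return the same gradient; the choice of s only matters on the measure-zero set where the argument of the ReLU is exactly 0, which is the situation analyzed in Theorems 1 and 2. Mainstream frameworks typically take s = 0.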
