Neural Information Processing Systems 

The structure of the appendix mainly follows the roadmap of the proof described in Section 4.4. In Appendix A, we define the characterizable population risk function in (31) to approximate the objective function. Appendix A also introduces some notation to simplify the analysis, and we recommend that readers refer to Table 3 for the major notations used in the proofs. In this paper, we consider multi-layer cases and need to derive a lower bound for the Hessian matrix for all the layers. Moreover, the input of an intermediate layer cannot be proved to be Gaussian; it instead belongs to a sub-Gaussian distribution.
