Notice that this is a contradiction because any point with $W_{L+1} = 0$ is in the set $S$. Hence, there exists no point at which the Hessian is negative semidefinite. This can be easily seen by replacing the convex loss with the squared loss in the proof for Theorem 1 and applying (18). We conclude that the Hessian must be indefinite at every saddle point under the assumptions; in other words, the Hessian has at least one strictly negative eigenvalue.

B.2 Models and architectures

Every ResNEst was a standard ResNet without the batch normalization and Rectified Linear Unit (ReLU) at the final residual representation, i.e., their architectures are exactly the same before the final residual representation.
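To make this design concrete, here is a minimal PyTorch sketch of a ResNEst-style network, assuming pre-activation-style residual blocks; the class names (`PreActBlock`, `ResNEst`) and hyperparameters are illustrative assumptions, not the paper's code. A standard pre-activation ResNet would apply a final BatchNorm and ReLU to the output of the last residual block before pooling; the ResNEst omits them, so the final residual representation feeds the prediction layer directly.

```python
# A minimal sketch, not the paper's implementation. Assumes pre-activation-style
# residual blocks; names and hyperparameters are illustrative.
import torch
import torch.nn as nn

class PreActBlock(nn.Module):
    """Pre-activation residual block: BN and ReLU stay inside the block."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)  # residual addition

class ResNEst(nn.Module):
    """ResNet-like model with no BN/ReLU at the final residual representation."""
    def __init__(self, channels: int = 16, num_blocks: int = 4, num_classes: int = 10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1, bias=False)
        self.blocks = nn.Sequential(*[PreActBlock(channels) for _ in range(num_blocks)])
        # A standard pre-activation ResNet would insert
        #   nn.BatchNorm2d(channels), nn.ReLU()
        # here; the ResNEst omits them, so everything before this point
        # is identical to the corresponding ResNet.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.blocks(self.stem(x))  # final residual representation
        return self.fc(self.pool(h).flatten(1))

# Example: one forward pass on a dummy batch.
if __name__ == "__main__":
    model = ResNEst()
    print(model(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])
```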
