A Adaptations of Algorithm 1 for different problems

Neural Information Processing Systems 

We extend Algorithm 1 to stochastic gradient descent (SGD). Algorithm 3 modifies Algorithm 1 to allow transformations on both parameters and data.

In this section, we derive the group actions for the test functions and multi-layer neural networks. More details about group theory can be found in textbooks such as Lang (2002).

B.1 Continuous symmetry in test functions

B.1.1 Ellipse

Consider the following loss function with a ∈ ℝ. However, we will only use the two-variable version in the experiments.
