dae3312c4c6c7000a37ecfb7b0aeb0e4-Paper.pdf

Neural Information Processing Systems 

Based on the so-calledtensor normal(TN) distribution [31],wepropose andanalyze abrandnewapproximate natural gradient method, Tensor Normal Training(TNT), which likeShampoo, only requires knowledge of the shape of the training parameters.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found