dae3312c4c6c7000a37ecfb7b0aeb0e4-Paper.pdf
–Neural Information Processing Systems
Based on the so-calledtensor normal(TN) distribution [31],wepropose andanalyze abrandnewapproximate natural gradient method, Tensor Normal Training(TNT), which likeShampoo, only requires knowledge of the shape of the training parameters.
Neural Information Processing Systems
Feb-11-2026, 11:05:33 GMT
- Technology: