Reviews: Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis
–Neural Information Processing Systems
Summary: The paper describes a generic second-order stochastic optimisation scheme that exploits curvature information to improve the trade-off between convergence speed and computational effort. It proposes an extension to KFAC, the approximate natural gradient method in which the approximation to the Fisher information matrix is restricted to have Kronecker structure. The authors relax the Kronecker constraint by using a general diagonal scaling matrix rather than a diagonal scaling matrix that is itself a Kronecker product. This diagonal scaling matrix is estimated from gradients, along with the Kronecker eigenbasis.

Quality: The idea in the paper is convincing and makes sense.
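To make the relaxation described in the summary concrete, the following is a minimal NumPy sketch for a single linear layer. It is not the authors' implementation: the synthetic data, the layer sizes, the damping value, and all variable names (`a`, `g`, `s_star`, and so on) are assumptions chosen for illustration, and the Fisher matrix is replaced by an empirical second moment of sampled gradients. The sketch builds the Kronecker eigenbasis from the two factor matrices, then compares the Kronecker-structured diagonal scaling with a general diagonal scaling estimated from squared gradients projected into that basis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, n_samples = 3, 2, 500  # hypothetical layer sizes

# Synthetic layer inputs `a` and back-propagated output gradients `g`.
# They are made dependent so that the Kronecker factorisation is not exact.
a = rng.normal(size=(n_samples, n_in))
g = rng.normal(size=(n_samples, n_out)) * (1.0 + np.abs(a[:, :1]))

# Per-example weight gradients g a^T, shape (n_samples, n_out, n_in).
per_example_grads = np.einsum("no,ni->noi", g, a)

# Kronecker factors and their eigendecompositions.
A = a.T @ a / n_samples
G = g.T @ g / n_samples
s_A, U_A = np.linalg.eigh(A)
s_G, U_G = np.linalg.eigh(G)

# Kronecker-structured diagonal scaling: products of factor eigenvalues.
s_kron = np.outer(s_G, s_A)

# General diagonal scaling: second moment of the gradients after
# projecting each one into the Kronecker eigenbasis (U_G^T grad U_A).
proj = np.einsum("op,noi,iq->npq", U_G, per_example_grads, U_A)
s_star = np.mean(proj**2, axis=0)

# Compare both approximations against the full empirical second moment.
flat = per_example_grads.reshape(n_samples, -1)  # row-major vec of each grad
F = flat.T @ flat / n_samples
Q = np.kron(U_G, U_A)  # eigenbasis matching the row-major flattening


def frobenius_error(diag_scaling):
    """Frobenius distance between F and Q diag(scaling) Q^T."""
    return np.linalg.norm(F - Q @ np.diag(diag_scaling.ravel()) @ Q.T)


err_kron = frobenius_error(s_kron)
err_star = frobenius_error(s_star)
print("Kronecker diagonal error:", err_kron)
print("general diagonal error:  ", err_star)

# Preconditioned step in the eigenbasis (damping value is an assumption).
grad = per_example_grads.mean(axis=0)
damping = 1e-3
step = U_G @ ((U_G.T @ grad @ U_A) / (s_star + damping)) @ U_A.T
```

Because `s_star` equals the diagonal of `F` expressed in the basis `Q`, it is the best diagonal scaling in that basis in the Frobenius sense, so `err_star` cannot exceed `err_kron` in this setup. The basis itself is unchanged; only the scaling is freed from the Kronecker constraint.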
Oct-7-2024, 09:24:32 GMT