MKOR: Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates
Neural Information Processing Systems
This work proposes MKOR, a Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates, which improves the training time and convergence properties of deep neural networks (DNNs). Second-order techniques, while enjoying higher convergence rates than their first-order counterparts, have cubic complexity with respect to the model size, the training batch size, or both.
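To illustrate why rank-1 updates matter for the cubic cost mentioned above, the following minimal sketch uses the standard Sherman-Morrison identity, which refreshes a known inverse after a rank-1 change in O(d^2) operations instead of re-inverting in O(d^3). This is a generic illustration, not the MKOR algorithm itself; the abstract does not specify MKOR's update rule, and the function names `mat_vec` and `sherman_morrison` are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sherman_morrison(a_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, using O(d^2) operations.

    Assumption: this is the textbook Sherman-Morrison formula, shown only
    to illustrate a rank-1 inverse update; it is not taken from the paper.
    """
    d = len(u)
    a_inv_u = mat_vec(a_inv, u)  # column vector A^{-1} u
    # Row vector v^T A^{-1}
    vt_a_inv = [sum(v[i] * a_inv[i][j] for i in range(d)) for j in range(d)]
    denom = 1.0 + sum(v[i] * a_inv_u[i] for i in range(d))
    if abs(denom) < 1e-12:
        raise ValueError("rank-1 update makes the matrix singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * vt_a_inv[j] / denom for j in range(d)]
        for i in range(d)
    ]


# Usage: A = diag(2, 4), so A^{-1} = diag(0.5, 0.25); update with u = v = [1, 1].
a_inv = [[0.5, 0.0], [0.0, 0.25]]
updated_inv = sherman_morrison(a_inv, [1.0, 1.0], [1.0, 1.0])
```

Here the updated matrix is [[3, 1], [1, 5]], and `updated_inv` matches its inverse without any full matrix inversion being performed.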
May-28-2025, 22:22:54 GMT