AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix
