Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning Danruo Deng

Neural Information Processing Systems 

To address such the'sensitivity-stability' dilemma, most previous efforts have been contributed to minimizing the empirical risk with different parameter regularization