DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity

Neural Information Processing Systems 

Loss of plasticity in warm-started networks occurs even under stationary data distributions, and its underlying mechanism is poorly understood.