Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models

Neural Information Processing Systems 

Despite their prevalence in deep-learning communities, over-parameterized models convey high demands of computational costs for proper training.