ImprovingDeepLearningOptimizationthrough ConstrainedParameterRegularization

Neural Information Processing Systems 

Unlike the uniform application of asingle penalty,CPR enforces an upper bound on astatistical measure, suchas theL2-norm, ofindividual parameter matrices.