On layer-level control of DNN training and its impact on generalization