Towards Better Generalization: Weight Decay Induces Low-rank Bias for Neural Networks
