Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers
