Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
