A rationale from frequency perspective for grokking in training neural network
