Towards Understanding Grokking: An Effective Theory of Representation Learning

Neural Information Processing Systems 

We find that representation learning occurs only in a "Goldilocks zone" (which includes comprehension and grokking) between memorization and confusion.