Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization
