Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
