Grokking phase transitions in learning local rules with gradient descent
