Droplets of Good Representations: Grokking as a First Order Phase Transition in Two Layer Networks
