How do language models learn facts? Dynamics, curricula and hallucinations
