How do language models learn facts? Dynamics, curricula and hallucinations