Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence
