A Phase Transition between Positional and Semantic Learning in a Solvable Model of Dot-Product Attention
