Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers

Siyu Chen
Department of Statistics and Data Science, Yale University
