Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
Siyu Chen, Department of Statistics and Data Science, Yale University
Neural Information Processing Systems
In particular, most existing work only provides a theoretical explanation of how the attention mechanism facilitates in-context learning (ICL) under certain data models.
Feb-16-2026, 00:45:54 GMT
- Country:
  - North America > United States
    - Illinois > Cook County > Chicago (0.04)
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.04)
- Genre:
  - Research Report > Experimental Study (1.00)
- Industry:
  - Information Technology (0.45)
- Technology: