Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
Siyu Chen, Department of Statistics and Data Science, Yale University

Feb-16-2026, 00:45:54 GMT–Neural Information Processing Systems

In particular, most existing work only theoretically explains how the attention mechanism facilitates in-context learning (ICL) under certain data models.

arXiv preprint, large language model, machine learning

Country:
- North America > United States
  - Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology (0.45)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language > Large Language Model (1.00)
    - Machine Learning
      - Statistical Learning (1.00)
      - Learning Graphical Models (0.94)
      - Neural Networks > Deep Learning (0.46)
