Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning
