Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training
