One-layer transformers fail to solve the induction heads task

Sanford, Clayton; Hsu, Daniel; Telgarsky, Matus

arXiv.org Machine Learning

The mechanistic interpretability studies of Elhage et al. (2021) and Olsson et al. (2022) identified the ubiquity and importance of so-called "induction heads" in transformer-based language models (Vaswani et al., 2017; Radford et al., 2019; Brown et al., 2020). The basic task performed by an induction head is as follows.