Attention-Only Transformers and Implementing MLPs with Attention Heads
Robert Huben, Valerie Morris
The transformer architecture was introduced in the landmark 2017 paper Attention is All You Need (Vaswani et al., 2017) and traditionally consists of alternating attention and multilayer-perceptron (MLP) sublayers. Although initially used for machine translation, transformers have since been applied across a wide range of tasks, including language modeling (Radford et al., 2018; Devlin et al., 2019; Liu et al., 2018), computer vision (Khan et al., 2022; Cornia et al., 2020), and image generation (Parmar et al., 2018). The widespread deployment of transformers has led to increasing interest in mechanistic interpretability (Wang et al., 2022; Conmy et al., 2023), which seeks to translate the computations of transformers into human-understandable explanations. Some interpretability efforts, such as Elhage et al. (2021), have focused on attention-only transformers, finding that MLP layers were harder to interpret. This work supplements those mechanistic interpretability methods by showing that MLP layers in transformers are equivalent to a sum of masked attention heads and can therefore be subjected to interpretability techniques that work on attention-only transformers. In Theorem 3 we show that, by including a "bias token" akin to the persistent memory vectors of Sukhbaatar et al. (2019) and using a slightly unusual attention-masking pattern, an MLP layer of size l can be written as the sum of l attention heads, each with internal dimension 1. We show in Theorem 6 that this process can be applied throughout the transformer, converting a typical MLP-and-attention transformer into an attention-only transformer. We then show in Theorems 7 and 8 that attention heads can separately implement row-wise linear transformations and matrix-level activation functions. Finally, we show in Theorem 9 that a slightly augmented network can approximate any masking pattern to within arbitrary error.
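To make the Theorem 3 construction concrete, the following is a minimal numerical sketch of how a single SiLU neuron can be reproduced by one attention head with internal dimension 1. The specific weight choices here (query and value projections both set to the neuron's input weights w, a constant key read off an appended homogeneous coordinate, and an all-zeros bias token) are illustrative assumptions on our part, not necessarily the paper's exact parameterization; the mask restricting each token to attend only to itself and the bias token stands in for the "slightly unusual attention-masking pattern."

```python
import numpy as np

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d, n = 8, 5                      # model dimension, sequence length
w = rng.normal(size=d)           # MLP neuron input weights
u = rng.normal(size=d)           # MLP neuron output direction
X = rng.normal(size=(n, d))      # token residual-stream vectors

# Direct MLP neuron: SiLU(w . x) * u for each token.
mlp_out = silu(X @ w)[:, None] * u

# Attention-head construction (internal dimension 1).
# Append a homogeneous coordinate of 1 to every real token and
# prepend a "bias token" whose homogeneous coordinate is 0.
Xh = np.concatenate([X, np.ones((n, 1))], axis=1)   # (n, d+1)
bias_tok = np.zeros(d + 1)                          # bias token
seq = np.vstack([bias_tok, Xh])                     # (n+1, d+1)

W_Q = np.concatenate([w, [0.0]])  # query score = w . x  (scalar)
W_K = np.eye(d + 1)[-1]           # key = homogeneous coord: 1 for tokens, 0 for bias
W_V = np.concatenate([w, [0.0]])  # value = w . x (scalar); 0 for the bias token

q = seq @ W_Q                     # (n+1,) one-dimensional queries
k = seq @ W_K                     # (n+1,) one-dimensional keys
v = seq @ W_V                     # (n+1,) one-dimensional values

# Unusual mask: each real token attends only to itself and the bias token.
head_out = np.zeros((n, d))
for i in range(1, n + 1):
    scores = np.array([q[i] * k[0], q[i] * k[i]])    # [bias, self] = [0, w.x]
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the pair
    val = weights[0] * v[0] + weights[1] * v[i]      # = sigmoid(w.x) * (w.x)
    head_out[i - 1] = val * u                        # output projection

assert np.allclose(head_out, mlp_out)  # head reproduces the SiLU neuron
```

Because the softmax over the two-element set {self, bias} reduces to a sigmoid of the score difference, this head realizes x * sigmoid(x) exactly; activations such as ReLU or GeLU would instead be matched by close approximation. Stacking l such heads, one per neuron, yields the sum described in Theorem 3.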
arXiv.org Artificial Intelligence
Sep-15-2023