Extracting Rule-based Descriptions of Attention Features in Transformers
