Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
