Clustering in pure-attention hardmax transformers and its role in sentiment analysis
