Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
