What can a Single Attention Layer Learn? A Study Through the Random Features Lens
