Self-attention as an attractor network: transient memories without backpropagation
