Self-attention as an attractor network: transient memories without backpropagation