Generalized Probabilistic Attention Mechanism in Transformers
