Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation

Open in new window