Assessing the Ability of Self-Attention Networks to Learn Word Order
