Alleviating the Inequality of Attention Heads for Neural Machine Translation
