Improving word mover's distance by leveraging self-attention matrix