The paradox of the compositionality of natural language: a neural machine translation case study