A Set of Recommendations for Assessing Human-Machine Parity in Language Translation