Naturalness Evaluation of Natural Language Generation in Task-oriented Dialogues using BERT