A Study of Automatic Metrics for the Evaluation of Natural Language Explanations