Social Biases in Automatic Evaluation Metrics for NLG