Toward Human-Like Evaluation for Natural Language Generation with Error Analysis
