Jan-18-2017, 10:21:44 GMT–AITopics Original Links

Following the machine translation community's recent adoption of automatic evaluation with the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs, i.e. ROUGE, correlates surprisingly well with human evaluations, based on various statistical metrics, whereas direct application of the BLEU evaluation procedure does not always give good results. For the inception of ROUGE, please read Lin and Hovy's HLT-NAACL 2003 paper (Lin and Hovy 2003). For more details, please read Lin's paper "ROUGE: a Package for Automatic Evaluation of Summaries" (Lin 2004a).
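To make the idea of unigram co-occurrence concrete, here is a minimal sketch of a ROUGE-1-style recall score between one candidate summary and one reference summary. It is an illustration only, not the ROUGE package itself: the function name `rouge1_recall` is hypothetical, and lowercasing plus whitespace tokenization is a simplifying assumption (no stemming, stopword removal, or multiple references).

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate.

    Matches are clipped: a reference token is counted at most as many
    times as it appears in the candidate. Tokenization by lowercasing
    and splitting on whitespace is an assumption made for this sketch.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total_ref = sum(ref_counts.values())
    if total_ref == 0:
        return 0.0  # empty reference: define the score as zero
    overlap = sum(min(count, cand_counts[token])
                  for token, count in ref_counts.items())
    return overlap / total_ref


if __name__ == "__main__":
    score = rouge1_recall("the cat sat on the mat", "the cat was on the mat")
    print(f"ROUGE-1 recall (sketch): {score:.3f}")
```

In this example, five of the six reference unigrams are matched by the candidate, so the recall is 5/6. Unlike BLEU, which is precision-oriented, the score is normalized by the length of the reference summary.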

artificial intelligence, evaluation, natural language

News Web Page

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)