Evaluating Text Output in NLP: BLEU at your own risk

Jan-16-2019, 13:43:38 GMT–#artificialintelligence

One question I get fairly often from folks who are just getting into NLP is how to evaluate a system whose output is text, rather than some sort of classification of the input text. These types of problems, where you put some text into your model and get some other text out of it, are known as sequence-to-sequence or string transduction problems. This sort of technology is right out of science fiction. With such a wide range of exciting applications, it's easy to see why sequence-to-sequence modeling is more popular than ever. What's not easy is actually evaluating these systems. Unfortunately for folks who are just getting started, there's no simple answer about which metric you should use to evaluate your model. Even worse, one of the most popular metrics for evaluating sequence-to-sequence tasks, BLEU, has major drawbacks, especially when applied to tasks that it was never intended to evaluate.



