Behavioral Testing of NLP models with CheckList

Sep-5-2020, 12:42:26 GMT–#artificialintelligence

When developing an NLP model, it is standard practice to test how well the model generalizes to unseen examples by evaluating it on a held-out dataset. Suppose we reach our target performance metric of 95% on the held-out dataset and deploy the model to production on the strength of that single metric. Once real users start using it, the story could be very different from what the 95% figure suggested: the model might perform poorly even on simple variations of the training text. In contrast, the field of software engineering uses a suite of unit tests, integration tests, and end-to-end tests to evaluate all aspects of a product for failures.
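To make the contrast concrete, here is a minimal sketch of a unit-test-style behavioral check set beside a held-out accuracy score. It does not use the CheckList library's API. The keyword-based `predict` function, the word lists, and the example sentences are all hypothetical stand-ins chosen for illustration, not anything taken from the article.

```python
# Toy illustration only: a hypothetical keyword-based sentiment "model".
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "terrible", "hate"}


def predict(text: str) -> str:
    """Return 'pos' or 'neg' by counting keyword hits (ties go to 'neg')."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if score > 0 else "neg"


def accuracy(examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(predict(text) == label for text, label in examples)
    return correct / len(examples)


def invariance_failures(pairs):
    """Return (original, variation) pairs whose predictions differ.

    Each variation is a small edit that should not change the label.
    """
    return [(a, b) for a, b in pairs if predict(a) != predict(b)]


# Held-out style evaluation: a single aggregate number.
held_out = [
    ("This airline is great", "pos"),
    ("The food was great", "pos"),
    ("I hate the delays", "neg"),
    ("The seat was terrible", "neg"),
]

# Behavioral checks: simple variations of text the model already handles.
variations = [
    ("This airline is great", "This airline is great!"),    # added punctuation
    ("The food was great", "The food was graet"),            # typo
    ("I hate the delays", "I hate the delays in Chicago"),   # added location
]

held_out_accuracy = accuracy(held_out)
failures = invariance_failures(variations)

print(f"held-out accuracy: {held_out_accuracy:.2f}")
for original, variation in failures:
    print(f"prediction changed: {original!r} -> {variation!r}")
```

In this toy setup the aggregate score looks perfect while two of the three variations flip the prediction, which is the kind of gap a suite of small, targeted tests is meant to expose.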



