An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction

Larson, Stefan, Mahendran, Anish, Peper, Joseph J., Clarke, Christopher, Lee, Andrew, Hill, Parker, Kummerfeld, Jonathan K., Leach, Kevin, Laurenzano, Michael A., Tang, Lingjia, Mars, Jason

Sep-4-2019–arXiv.org Artificial Intelligence

Task-oriented dialog systems need to know when a query falls outside their range of supported intents, but current text classification corpora only define label sets that cover every example. We introduce a new dataset that includes out-of-scope queries, i.e., queries that do not fall into any of the system's supported intents. This poses a new challenge because models cannot assume that every query at inference time belongs to a system-supported intent class. Our dataset also covers 150 intent classes over 10 domains, capturing the breadth that a production task-oriented agent must handle. We evaluate a range of benchmark classifiers on our dataset along with several different out-of-scope identification schemes. We find that while the classifiers perform well on in-scope intent classification, they struggle to identify out-of-scope queries. Our dataset and evaluation fill an important gap in the field, offering a way of more rigorously and realistically benchmarking text classification in task-driven dialog systems.
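The abstract does not say which out-of-scope identification schemes were evaluated. As an illustration only, here is a minimal sketch of one common approach, which is to threshold the classifier's confidence. It assumes a classifier that returns one raw score per supported intent. The intent names, the `out_of_scope` label, and the 0.7 threshold are all hypothetical and are not taken from the dataset.

```python
import math
from typing import Dict

# Hypothetical label returned when no supported intent is confident enough.
OOS_LABEL = "out_of_scope"


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw per-intent scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {intent: math.exp(s - peak) for intent, s in scores.items()}
    total = sum(exps.values())
    return {intent: e / total for intent, e in exps.items()}


def predict_intent(scores: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the top intent, or OOS_LABEL if its probability is below threshold.

    The threshold value is an assumption; in practice it would be tuned
    on held-out data containing both in-scope and out-of-scope queries.
    """
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else OOS_LABEL


if __name__ == "__main__":
    # Hypothetical scores for two queries over three made-up intents.
    confident = {"book_flight": 5.0, "check_balance": 0.0, "set_alarm": 0.0}
    uncertain = {"book_flight": 0.1, "check_balance": 0.0, "set_alarm": 0.0}
    print(predict_intent(confident))
    print(predict_intent(uncertain))
```

A scheme like this works only if the classifier is less confident on out-of-scope queries than on in-scope ones. The abstract's finding that classifiers struggle to identify out-of-scope queries suggests that assumption may not hold reliably.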


