Using AI-generated questions to train NLP systems

#artificialintelligence 

A recent approach to the popular extractive question answering (extractive QA) task that generates its own training data instead of requiring existing annotated question answering examples. Extractive QA is a popular task for natural language processing (NLP) research, where models must extract a short snippet from a document in order to answer a natural language question. Though supervised models perform well at extractive QA, they require thousands -- sometimes hundreds of thousands -- of annotated examples for training, and their performance suffers when tested outside of the textual domains and language they were trained on. By approaching extractive QA as a self-supervised task, our technique outperformed early supervised models on the widely used SQuAD data set while requiring no annotated question answering training data. The code for our method is now available to download.