Rapidly Bootstrapping a Question Answering Dataset for COVID-19
Raphael Tang, Rodrigo Nogueira, Edwin Zhang, Nikhil Gupta, Phuong Cam, Kyunghyun Cho, Jimmy Lin
arXiv.org Artificial Intelligence
We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge. To our knowledge, this is the first publicly available resource of its type, and it is intended as a stopgap measure for guiding research until more substantial evaluation resources become available. The dataset comprises 124 question-article pairs as of the present version 0.1 release. While this is not enough examples for supervised machine learning, we believe that the dataset can be helpful for evaluating the zero-shot or transfer capabilities of existing models on topics specifically related to COVID-19. This paper describes our methodology for constructing the dataset and presents the effectiveness of a number of baselines, including term-based techniques and various transformer-based models. The dataset is available at http://covidqa.ai/
Apr-23-2020
- Genre:
- Research Report (1.00)