Few-NERD: A Few-Shot Named Entity Recognition Dataset
Ding, Ning, Xu, Guangwei, Chen, Yulin, Wang, Xiaobin, Han, Xu, Xie, Pengjun, Zheng, Hai-Tao, Liu, Zhiyuan
arXiv.org Artificial Intelligence
Recently, a considerable body of literature has grown up around few-shot named entity recognition (NER), but little published benchmark data focuses specifically on this practical and challenging task. Current approaches collect existing supervised NER datasets and reorganize them into the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia comprising 4,601,160 words, each annotated as context or as part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and that the problem requires further research. We make Few-NERD public at https://ningding97.github.io/fewnerd/.
Jun-2-2021
- Country:
  - Asia (1.00)
  - Europe > United Kingdom (1.00)
  - North America
    - Canada (0.68)
    - United States
      - California (0.28)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Education (0.67)
  - Government > Regional Government (0.67)
  - Health & Medicine (0.68)
  - Leisure & Entertainment > Sports (0.67)
  - Media (0.93)
  - Transportation > Air (0.93)
- Technology: