Few-NERD: A Few-Shot Named Entity Recognition Dataset

Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu

Jun-2-2021–arXiv.org Artificial Intelligence

Recently, a considerable literature has grown up around few-shot named entity recognition (NER), but little published benchmark data focuses specifically on this practical and challenging task. Current approaches collect existing supervised NER datasets and reorganize them into the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia, comprising 4,601,160 words, each annotated either as context or as part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and that the problem requires further research. We make Few-NERD public at https://ningding97.github.io/fewnerd/.
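As a rough illustration of the two-level annotation described above, the sketch below groups word-level labels into entity spans that carry both a coarse-grained and a fine-grained type. The label format (`coarse-fine` strings, with `O` marking context words), the helper names, and the example type names are assumptions made for illustration; the abstract does not specify how the released files encode labels.

```python
from typing import List, Optional, Tuple

# Assumed marker for words annotated as context (not part of any entity).
CONTEXT = "O"


def split_label(label: str) -> Optional[Tuple[str, str]]:
    """Split an assumed 'coarse-fine' label into (coarse, fine); None for context."""
    if label == CONTEXT:
        return None
    coarse, fine = label.split("-", 1)
    return coarse, fine


def extract_spans(words: List[str], labels: List[str]) -> List[Tuple[str, str, str]]:
    """Group consecutive words sharing a label into (text, coarse, fine) spans.

    Limitation of this sketch: two adjacent entities of the same type are
    merged into one span, because no boundary marker is assumed.
    """
    spans = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sentence or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            levels = split_label(labels[start])
            if levels is not None:
                spans.append((" ".join(words[start:i]), levels[0], levels[1]))
            start = i
    return spans


# Illustrative sentence; the type names here are hypothetical.
words = ["Marie", "Curie", "worked", "in", "Paris"]
labels = ["person-scientist", "person-scientist", "O", "O", "location-city"]
spans = extract_spans(words, labels)
```

Keeping both levels on every span makes it possible to evaluate at either granularity, for example holding out whole coarse types or only some fine types within a coarse type.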
