Can LLM Substitute Human Labeling? A Case Study of Fine-grained Chinese Address Entity Recognition Dataset for UAV Delivery

Yao, Yuxuan, Luo, Sichun, Zhao, Haohan, Deng, Guanzhi, Song, Linqi

Mar-19-2024–arXiv.org Artificial Intelligence

We present CNER-UAV, a fine-grained Chinese Name Entity Recognition dataset designed for address resolution in Unmanned Aerial Vehicle (UAV) delivery systems. The dataset covers five categories, enabling comprehensive training and evaluation of NER models. To construct it, we sourced data from a real-world UAV delivery system and applied rigorous data cleaning and desensitization to ensure privacy and data integrity. The resulting dataset consists of around 12,000 samples annotated by both human experts and a Large Language Model (LLM). We evaluated classical NER models on the dataset and provide an in-depth analysis. The dataset and models are publicly available at https://github.com/zhhvvv/CNER-UAV.


