NERsocial: Efficient Named Entity Recognition Dataset Construction for Human-Robot Interaction Utilizing RapidNER
Jesse Atuhurra, Hidetaka Kamigaito, Hiroki Ouchi, Hiroyuki Shindo, Taro Watanabe
–arXiv.org Artificial Intelligence
Adapting named entity recognition (NER) methods to new domains poses significant challenges. We introduce RapidNER, a framework designed for the rapid deployment of NER systems through efficient dataset construction. RapidNER operates through three key steps: (1) extracting domain-specific sub-graphs and triples from a general knowledge graph, (2) collecting and leveraging texts from various sources to build the NERsocial dataset, which focuses on entities typical in human-robot interaction, and (3) implementing an annotation scheme using Elasticsearch (ES) to enhance efficiency. NERsocial, validated by human annotators, includes six entity types, 153K tokens, and 99.4K sentences, demonstrating RapidNER's capability to expedite dataset creation.
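The first and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy triples, the `instance_of` relation, and the entity types below are hypothetical, and a plain in-memory dictionary lookup stands in for the Elasticsearch-based annotation scheme the abstract mentions. The sketch filters a small knowledge graph down to domain triples, builds a gazetteer from them, and tags text with BIO labels by greedy longest match.

```python
# Hypothetical toy knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("Sushi", "instance_of", "food"),
    ("Green tea", "instance_of", "drink"),
    ("Judo", "instance_of", "sport"),
    ("Mount Fuji", "instance_of", "mountain"),
]

# Assumed entity types for illustration; not the six types used in NERsocial.
DOMAIN_TYPES = {"food", "drink", "sport"}


def extract_domain_triples(triples, domain_types):
    """Step 1 (sketch): keep triples whose object is a domain entity type."""
    return [t for t in triples if t[1] == "instance_of" and t[2] in domain_types]


def build_gazetteer(triples):
    """Map each lowercased entity name to an upper-cased type label."""
    return {subj.lower(): obj.upper() for subj, _, obj in triples}


def annotate(sentence, gazetteer):
    """Step 3 stand-in: greedy longest-match BIO tagging over whitespace tokens.

    An in-memory dict replaces the search index here; this is an assumption
    made to keep the example self-contained.
    """
    tokens = sentence.split()
    tags = ["O"] * len(tokens)
    max_len = max((len(name.split()) for name in gazetteer), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n]).lower()
            if span in gazetteer:
                label = gazetteer[span]
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return list(zip(tokens, tags))


domain_triples = extract_domain_triples(TRIPLES, DOMAIN_TYPES)
gazetteer = build_gazetteer(domain_triples)
for token, tag in annotate("I drink green tea after judo", gazetteer):
    print(token, tag)
```

Exact matching keeps the example short. A search index would typically add tokenization, normalization, and scalable lookup over large corpora, and human validation of the resulting labels would still be needed, as the abstract notes.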
Nov-27-2024
- Country:
- Oceania > Fiji (0.04)
- South America
- Peru (0.04)
- Chile > Santiago Metropolitan Region
- Santiago Province > Santiago (0.04)
- North America
- United States
- Virginia (0.04)
- Maine (0.04)
- New York (0.04)
- Tennessee (0.04)
- Colorado (0.04)
- Kentucky (0.04)
- California (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Washington > King County
- Seattle (0.04)
- Cuba > La Habana Province
- Havana (0.04)
- Canada > British Columbia
- Europe
- Spain > Aragón (0.04)
- Greece (0.04)
- Austria (0.04)
- Poland (0.04)
- Italy (0.04)
- Middle East > Cyprus (0.04)
- Czechia > Prague (0.04)
- Russia > Central Federal District
- Moscow Oblast > Moscow (0.04)
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Netherlands
- South Holland > Dordrecht (0.04)
- North Brabant > 's-Hertogenbosch (0.04)
- Friesland (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- San Marino > Fiorentino
- Fiorentino (0.04)
- Asia
- India (0.04)
- Singapore (0.04)
- China (0.04)
- Japan > Kyūshū & Okinawa
- Okinawa (0.04)
- Indonesia
- Sumatra > West Sumatra (0.04)
- Sulawesi > South Sulawesi
- Makassar (0.04)
- Java > West Java
- Sumedang (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Information Technology (1.00)
- Banking & Finance (1.00)
- Education (0.92)
- Law Enforcement & Public Safety (0.67)
- Government > Military (0.67)
- Retail (0.67)
- Consumer Products & Services
- Food, Beverage, Tobacco & Cannabis > Beverages (1.00)
- Restaurants (0.67)
- Leisure & Entertainment
- Games (1.00)
- Sports
- Motorsports (1.00)
- Martial Arts (1.00)
- Health & Medicine
- Therapeutic Area (1.00)
- Consumer Health (1.00)
- Transportation
- Media
- Technology: