OpenNER 1.0: Standardized Open-Access Named Entity Recognition Datasets in 50+ Languages
Chester Palen-Michel, Maxwell Pickering, Maya Kruse, Jonne Sälevä, Constantine Lignos
arXiv.org Artificial Intelligence
We present OpenNER 1.0, a standardized collection of openly available named entity recognition (NER) datasets. OpenNER contains 34 datasets spanning 51 languages, annotated in varying named entity ontologies. We correct annotation format issues, standardize the original datasets into a uniform representation, map entity type names to be more consistent across corpora, and provide the collection in a structure that enables research in multilingual and multi-ontology NER. We provide baseline models using three pretrained multilingual language models to compare the performance of recent models and facilitate future research in NER.
Dec-12-2024
- Country:
- Africa
- Niger (0.06)
- North Africa (0.04)
- Asia
- Japan > Kyūshū & Okinawa
- Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
- Middle East
- Israel (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Europe
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Sweden > Östergötland County
- Linköping (0.04)
- Slovenia (0.04)
- Finland > Southwest Finland
- Turku (0.04)
- Bulgaria > Sofia City Province
- Sofia (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.05)
- Italy > Tuscany
- Pisa Province > Pisa (0.04)
- Spain > Valencian Community
- Valencia Province > Valencia (0.04)
- Iceland > Capital Region
- Reykjavik (0.04)
- North America
- Canada
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Ontario > Toronto (0.04)
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Virginia > Fairfax County
- Fairfax (0.04)
- Washington > King County
- Seattle (0.04)
- Genre:
- Research Report (0.50)
- Technology: