Arukikata Travelogue Dataset with Geographic Entity Mention, Coreference, and Link Annotation
Higashiyama, Shohei, Ouchi, Hiroki, Teranishi, Hiroki, Otomo, Hiroyuki, Ide, Yusuke, Yamamoto, Aitaro, Shindo, Hiroyuki, Matsuda, Yuki, Wakamiya, Shoko, Inoue, Naoya, Yamada, Ikuya, Watanabe, Taro
arXiv.org Artificial Intelligence
Geoparsing is a fundamental technique for analyzing geo-entity information in text. We focus on document-level geoparsing, which considers geographic relatedness among geo-entity mentions, and present a Japanese travelogue dataset designed for evaluating document-level geoparsing systems. Our dataset comprises 200 travelogue documents with rich geo-entity information: 12,171 mentions, 6,339 coreference clusters, and 2,551 geo-entities linked to geo-database entries.
May-23-2023
- Country:
- Asia
- China > Beijing
- Beijing (0.04)
- Japan
- Hokkaidō (0.04)
- Honshū
- Kansai
- Kyoto Prefecture > Kyoto (0.05)
- Nara Prefecture (0.04)
- Kantō > Tokyo Metropolis Prefecture
- Tokyo (0.14)
- Europe
- France (0.04)
- Germany > Berlin (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Middle East > Malta
- Port Region > Southern Harbour District > Valletta (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Slovenia (0.04)
- Spain > Canary Islands
- Gran Canaria > Las Palmas de Gran Canaria (0.04)
- United Kingdom > Scotland
- City of Edinburgh > Edinburgh (0.04)
- North America
- Canada > British Columbia
- United States
- Maryland > Howard County
- Columbia (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Oregon > Multnomah County
- Portland (0.04)
- Oceania > Australia
- Genre:
- Research Report (1.00)
- Technology: