Annotation Guidelines for Corpus Novelties: Part 2 -- Alias Resolution Version 1.0
Amalvy, Arthur, Labatut, Vincent
arXiv.org Artificial Intelligence
This document provides instructions for annotating aliases in the Novelties corpus. The corpus itself will be described separately. It was built mainly to fulfill two goals: in the short term, to train and test NLP methods able to handle long texts, and in the longer term, to support the development of Renard [2], a pipeline for extracting character networks from literary fiction. Besides alias resolution, this pipeline includes several other processing steps, such as named entity recognition and coreference resolution. Character networks can be used to tackle a number of tasks, including assessing literary theories or the level of historicity of a narrative, detecting roles in stories, classifying novels, identifying subplots, segmenting a storyline, summarizing a story, designing recommendation systems, and aligning narratives. See the detailed survey of Labatut and Bost [6] for more information regarding character networks. Annotation guidelines for alias resolution are rare in the literature, so the ones presented here are designed from scratch, taking the context of this application into account.
Oct-1-2024
- Country:
- Atlantic Ocean > Mediterranean Sea (0.04)
- Europe
- Austria (0.05)
- Denmark (0.04)
- France > Hauts-de-France (0.05)
- North America
- Canada > Newfoundland and Labrador
- Newfoundland (0.04)
- Greenland (0.04)
- Genre:
- Research Report (0.40)
- Industry:
- Consumer Products & Services (0.47)
- Technology: