SubRegWeigh: Effective and Efficient Annotation Weighing with Subword Regularization
Kohei Tsuji, Tatsuya Hiraoka, Yuchang Cheng, Tomoya Iwakura
arXiv.org Artificial Intelligence
Various NLP tasks exploit pairs of raw text and annotation labels for training and evaluating models. For example, in named entity recognition (NER), which is applied to practical technologies such as location detection (Inkpen et al., 2017) and anonymization (Mamede et al., 2016), parts of the text are annotated as named entities (e.g., location names or personal names), and a model is then trained to extract these entities from the raw text. To achieve higher performance in NLP tasks, models should be trained or fine-tuned on a sophisticated training dataset free of annotation errors.

Methods for weighing annotation errors have recently been studied in the NER field. Wang et al. (2019) proposed CrossWeigh, a method that detects annotation errors in the dataset and adjusts their learning priority by weighting loss values so that training is not affected by such errors. However, it has shortcomings in computational efficiency, especially given recent NLP trends toward pre-trained large language models. We consider that more efficient annotation-weighing methods can speed up the development of NLP. In addition, reducing the computational cost contributes to Green AI (Schwartz
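The loss-weighting idea described above can be illustrated with a minimal sketch. This is not the paper's or CrossWeigh's implementation; the predicted distributions, labels, and weight values below are hypothetical, and the weights stand in for the output of an error-detection pass that flags suspect annotations.

```python
import math

def cross_entropy(probs, label):
    # Negative log-likelihood of the gold label under the predicted distribution.
    return -math.log(probs[label])

# Hypothetical predicted distributions paired with gold labels.
examples = [
    ([0.7, 0.2, 0.1], 0),
    ([0.1, 0.8, 0.1], 1),
    ([0.3, 0.3, 0.4], 0),  # suspected annotation error
]

# Per-example weights: a suspected error gets a lower weight,
# so it contributes less to the training objective.
weights = [1.0, 1.0, 0.3]

losses = [cross_entropy(probs, label) for probs, label in examples]
weighted_loss = sum(w * l for w, l in zip(weights, losses)) / len(losses)
unweighted_loss = sum(losses) / len(losses)
```

Down-weighting the flagged example shrinks its gradient contribution rather than discarding it outright, which is the general trade-off such weighting schemes exploit.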
Sep-10-2024