Event-Arguments Extraction Corpus and Modeling using BERT for Arabic
Aljabari, Alaa; Duaibes, Lina; Jarrar, Mustafa; Khalilia, Mohammed
arXiv.org Artificial Intelligence
Event-argument extraction is a challenging task, particularly in Arabic, where linguistic resources are sparse. To fill this gap, we introduce the \hadath corpus (550k tokens), an extension of Wojood enriched with event-argument annotations. We annotate three types of event arguments (agent, location, and date) as relation types. Our inter-annotator agreement evaluation yielded a Kappa score of 82.23% and an F1-score of 87.2%. We also propose a novel method for event relation extraction using BERT that treats the task as text entailment; it achieves an F1-score of 94.01%. To evaluate how well the method generalizes, we collected and annotated an out-of-domain corpus of about 80k tokens, called \testNLI, and used it as a second test set, on which our approach achieved an F1-score of 83.59%. Finally, we propose an end-to-end system for event-argument extraction. Both corpora are publicly available (https://sina.birzeit.edu/wojood), and the system is implemented as part of SinaTools.
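The entailment framing mentioned in the abstract can be illustrated with a minimal sketch. It assumes each candidate (event, argument, relation) triple is turned into a premise/hypothesis pair that a BERT sentence-pair classifier would then label as entailed or not. The English templates, the `NLIPair` type, and the `build_pairs` helper are hypothetical and chosen only for illustration; the abstract does not specify the authors' actual hypothesis wording or pairing strategy.

```python
from dataclasses import dataclass

# Hypothetical English templates, one per relation type named in the abstract.
# The wording actually used by the authors is not given in this record.
TEMPLATES = {
    "agent": "{arg} is the agent of {event}",
    "location": "{arg} is the location of {event}",
    "date": "{arg} is the date of {event}",
}


@dataclass(frozen=True)
class NLIPair:
    premise: str      # the original sentence
    hypothesis: str   # a templated statement about one candidate relation
    relation: str     # which relation type the hypothesis asserts
    label: int        # 1 = entailed (relation holds), 0 = not entailed


def build_pairs(sentence, event, candidates, gold):
    """Create one premise/hypothesis pair per (candidate, relation type).

    `gold` is a set of (argument, relation) tuples that hold for `event`.
    """
    pairs = []
    for arg in candidates:
        for relation, template in TEMPLATES.items():
            hypothesis = template.format(arg=arg, event=event)
            label = int((arg, relation) in gold)
            pairs.append(NLIPair(sentence, hypothesis, relation, label))
    return pairs


pairs = build_pairs(
    sentence="The ministry held a conference in Ramallah on Monday.",
    event="conference",
    candidates=["The ministry", "Ramallah", "Monday"],
    gold={("The ministry", "agent"), ("Ramallah", "location"), ("Monday", "date")},
)
```

In this sketch every candidate is paired with every relation type, so three candidates yield nine pairs, three of which are positive. Each pair would be passed to a sentence-pair classifier as (premise, hypothesis) for a binary decision.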
Jul-30-2024
- Country:
  - North America > United States
    - New York > New York County > New York City (0.04)
  - Europe
    - France
      - Île-de-France > Paris
        - Paris (0.04)
      - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
        - Marseille (0.04)
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
  - Asia
    - Thailand > Bangkok
      - Bangkok (0.04)
    - Middle East
      - Palestine > Gaza Strip (0.05)
      - UAE
        - Dubai Emirate > Dubai (0.04)
        - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - Africa > Middle East
    - Egypt (0.14)
- Genre:
  - Research Report > Promising Solution (0.34)
- Industry:
  - Government (0.93)
- Technology: