A Benchmark and Scoring Algorithm for Enriching Arabic Synonyms

Ghanem, Sana; Jarrar, Mustafa; Jarrar, Radi; Bounhas, Ibrahim

Feb-4-2023, arXiv.org Artificial Intelligence

This paper addresses the task of extending a given synset with additional synonyms, taking into account synonymy strength as a fuzzy value. Given a mono/multilingual synset and a threshold (a fuzzy value in [0, 1]), our goal is to extract new synonyms above this threshold from existing lexicons. Our contribution is twofold: an algorithm and a benchmark dataset. The dataset consists of 3K candidate synonyms for 500 synsets. Each candidate synonym is annotated with a fuzzy value by four linguists. The dataset is important for (i) understanding how much linguists agree or disagree on synonymy and (ii) serving as a baseline for evaluating our algorithm. Our proposed algorithm extracts synonyms from existing lexicons and computes a fuzzy value for each candidate. Our evaluations show that the algorithm behaves like a linguist and that its fuzzy values are close to those proposed by linguists, as measured by RMSE and MAE. The dataset and a demo page are publicly available at https://portal.sina.birzeit.edu/synonyms.
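
The abstract mentions two mechanical steps that a short sketch can make concrete: keeping candidates whose fuzzy value clears a threshold, and comparing algorithm scores with linguist scores using RMSE and MAE. The following is a minimal illustration only, not the authors' algorithm. The function names, the inclusive threshold comparison, the averaging of the four linguists' values, and all sample values are assumptions made for the example.

```python
import math


def filter_by_threshold(candidates, threshold):
    """Keep candidate synonyms whose fuzzy value reaches the threshold.

    `candidates` maps a candidate word to a fuzzy value in [0, 1].
    Treating the threshold as inclusive is an assumption of this sketch.
    """
    return [word for word, score in candidates.items() if score >= threshold]


def rmse(predicted, reference):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, reference):
    """Mean absolute error between two equal-length score lists."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical data: algorithm scores and four linguist scores per candidate.
    algorithm_scores = {"cand_a": 0.9, "cand_b": 0.6, "cand_c": 0.3}
    linguist_scores = {
        "cand_a": [1.0, 0.8, 0.9, 0.9],
        "cand_b": [0.5, 0.7, 0.6, 0.4],
        "cand_c": [0.2, 0.4, 0.1, 0.3],
    }

    # Averaging the four annotators is an assumption of this sketch.
    words = sorted(algorithm_scores)
    predicted = [algorithm_scores[w] for w in words]
    reference = [sum(linguist_scores[w]) / len(linguist_scores[w]) for w in words]

    print("kept:", filter_by_threshold(algorithm_scores, 0.5))
    print("RMSE:", rmse(predicted, reference))
    print("MAE:", mae(predicted, reference))
```

Lower RMSE and MAE mean the computed fuzzy values sit closer to the linguists' judgments. RMSE penalizes large individual deviations more heavily than MAE does.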

Country:
- Europe
  - United Kingdom > England
    - Greater London > London (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - France
    - Île-de-France > Paris
      - Paris (0.04)
    - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
      - Marseille (0.04)
- Asia > Middle East
  - UAE (0.04)
  - Palestine (0.04)
  - Jordan > Amman Governorate
    - Amman (0.04)
- Africa
  - Middle East > Tunisia (0.04)
  - Sudan (0.04)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Representation & Reasoning > Ontologies (0.94)
  - Machine Learning (0.68)