MultiLS-SP/CA: Lexical Complexity Prediction and Lexical Simplification Resources for Catalan and Spanish

Bott, Stefan, Saggion, Horacio, Rojas, Nelson Pérez, Salazar, Martin Solis, Ramirez, Saul Calderon

Apr-11-2024 – arXiv.org Artificial Intelligence

Automatic lexical simplification is the task of replacing lexical items that may be unfamiliar and difficult to understand with easier, more common words. This paper presents MultiLS-SP/CA, a novel dataset for lexical simplification in Spanish and Catalan. It is the first dataset of its kind for Catalan and a substantial addition to the sparse data available for automatic lexical simplification in Spanish. Specifically, MultiLS-SP is the first Spanish dataset that includes scalar ratings of how difficult lexical items are to understand. We also describe experiments with this dataset that can serve as a baseline for future work on the same data.

Country:
- North America
  - United States
    - Maryland (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
  - Costa Rica > Cartago Province
    - Cartago (0.04)
- Europe
  - Sweden > Östergötland County
    - Linköping (0.04)
  - Spain
    - Valencian Community > Valencia Province
      - Valencia (0.04)
    - Catalonia > Barcelona Province
      - Barcelona (0.04)
  - Russia > Northwestern Federal District
    - Leningrad Oblast > Saint Petersburg (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Bulgaria > Varna Province
    - Varna (0.04)
- Asia
  - Russia (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - China > Beijing
    - Beijing (0.04)

Genre:
- Research Report (0.40)

Industry:
- Government (0.67)
- Education (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (1.00)
  - Machine Learning (1.00)
