Policy-based Sentence Simplification: Replacing Parallel Corpora with LLM-as-a-Judge

Wu, Xuanxin, Arase, Yuki, Nagata, Masaaki

Dec-9-2025–arXiv.org Artificial Intelligence

Sentence simplification aims to modify a sentence to make it easier to read and understand while preserving the meaning. Different applications require distinct simplification policies, such as replacing only complex words at the lexical level or rewriting the entire sentence while trading off details for simplicity. However, achieving such policy-driven control remains an open challenge. In this work, we introduce a simple yet powerful approach that leverages Large Language Model-as-a-Judge (LLM-as-a-Judge) to automatically construct policy-aligned training data, completely removing the need for costly human annotation or parallel corpora. Our method enables building simplification systems that adapt to diverse simplification policies.

Sentence simplification could benefit users with reading difficulties, such as second-language (L2) learners and people with reading impairments (e.g., dyslexic individuals), by making text easier to read and understand (Alva-Manchego et al., 2020b). It involves a series of edits, such as lexical paraphrasing, sentence splitting, and removing irrelevant details (Xu et al., 2015).

The preferred edit policy, i.e., the set of edits that are permissible or appropriate for a given text, varies significantly depending on the target audience. In L2 education, one of the major application areas for simplification, previous work in both NLP and language education research has shown that the desired type and degree of simplification edits change depending on learner proficiency and readability levels (Agrawal et al., 2021; Zhong et al., 2020). Specifically, low- to intermediate-level learners benefit from a combination of lexical paraphrasing, structural modifications, and selective deletions to reduce cognitive load. In contrast, advanced learners benefit from lexical paraphrasing, which supports vocabulary acquisition (Chen, 2019), but they gain comparatively less from added cohesion or deletion (Hosoda, 2016; Zhong et al., 2020).

Motivated by these findings, we introduce two distinct edit policies. As illustrated in Table 1, overall-rewriting simplification often combines lexical paraphrasing, structural modifications, and deletions to improve readability for intermediate-level language learners. In contrast, lexical-paraphrasing simplification (Paetzold & Specia, 2016; Li et al., 2025) adheres closely to the original sentence while supporting more efficient vocabulary acquisition for advanced learners.

large language model, machine learning, simplification

Genre:
- Research Report > New Finding (0.46)

Industry:
- Education (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.95)
  - Natural Language > Large Language Model (1.00)
