Sentence Simplification
Policy-based Sentence Simplification: Replacing Parallel Corpora with LLM-as-a-Judge
Wu, Xuanxin, Arase, Yuki, Nagata, Masaaki
Sentence simplification aims to modify a sentence to make it easier to read and understand while preserving its meaning. Different applications require distinct simplification policies, such as replacing only complex words at the lexical level or rewriting the entire sentence while trading off details for simplicity. However, achieving such policy-driven control remains an open challenge. In this work, we introduce a simple yet powerful approach that leverages Large Language Model-as-a-Judge (LLM-as-a-Judge) to automatically construct policy-aligned training data, completely removing the need for costly human annotation or parallel corpora. Our method enables building simplification systems that adapt to diverse simplification policies.

Sentence simplification could benefit users with reading difficulties, such as second-language (L2) learners and people with reading impairments (e.g., dyslexic individuals), by making text easier to read and understand (Alva-Manchego et al., 2020b). It involves a series of edits, such as lexical paraphrasing, sentence splitting, and removing irrelevant details (Xu et al., 2015). The preferred edit policy, i.e., the set of permissible or appropriate edits for a given text, varies significantly depending on the target audience. In L2 education, one of the major application areas for simplification, previous work in both NLP and language education research has shown that the desired type and degree of simplification edits change depending on learner proficiency and readability levels (Agrawal et al., 2021; Zhong et al., 2020). Specifically, low- to intermediate-level learners benefit from a combination of lexical paraphrasing, structural modifications, and selective deletions to reduce cognitive load. In contrast, advanced learners benefit from lexical paraphrasing, which supports vocabulary acquisition (Chen, 2019), but they gain comparatively less from added cohesion or deletion (Hosoda, 2016; Zhong et al., 2020).
Motivated by these findings, we introduce two distinct edit policies. As illustrated in Table 1, overall-rewriting simplification often combines lexical paraphrasing, structural modifications, and deletions to improve readability for intermediate-level language learners. In contrast, lexical-paraphrasing simplification (Paetzold & Specia, 2016; Li et al., 2025) adheres closely to the original sentence while supporting more efficient vocabulary acquisition for advanced learners.
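The judging step in the abstract above is left abstract; a minimal sketch of how an LLM-as-a-Judge filter could assemble policy-aligned training pairs might look as follows. All names here (`call_llm_judge`, `build_policy_aligned_data`) are hypothetical, and the judge is stubbed with a crude word-count heuristic in place of a real LLM call:

```python
def call_llm_judge(source: str, candidate: str, policy: str) -> bool:
    """Hypothetical judge: True if `candidate` follows `policy`.
    Stubbed with a length heuristic purely for illustration; a real
    system would prompt an LLM with the policy description."""
    src_len, cand_len = len(source.split()), len(candidate.split())
    if policy == "lexical-paraphrasing":
        # Lexical policy: sentence length should stay roughly unchanged.
        return abs(cand_len - src_len) <= 2
    if policy == "overall-rewriting":
        # Rewriting policy: deletions are allowed, so expect a shorter sentence.
        return cand_len < src_len
    raise ValueError(f"unknown policy: {policy}")

def build_policy_aligned_data(pairs, policy):
    """Keep only (source, candidate) pairs the judge accepts for `policy`."""
    return [(s, c) for s, c in pairs if call_llm_judge(s, c, policy)]

pairs = [
    ("The committee deliberated at length before reaching a verdict.",
     "The committee talked a long time before reaching a decision."),
    ("The committee deliberated at length before reaching a verdict.",
     "The committee decided."),
]
```

Under this stub, the first candidate passes the lexical-paraphrasing policy and the second passes the overall-rewriting policy, illustrating how the same source sentence yields different policy-aligned data.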
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- South America > Brazil (0.04)
- Europe > Russia (0.04)
A Hybrid Multi-Agent Prompting Approach for Simplifying Complex Sentences
Zunjare, Pratibha, Hsiao, Michael
This paper addresses the challenge of transforming complex sentences into sequences of logical, simplified sentences while preserving semantic and logical integrity with the help of Large Language Models. We propose a hybrid approach that combines advanced prompting with multi-agent architectures to enhance the sentence simplification process. Experimental results show that our approach successfully simplified 70% of the complex sentences written for a video game design application. In comparison, a single-agent approach attained a 48% success rate on the same task.

Sentence simplification is a challenging task in computational linguistics. The simplification process aims to transform complex sentences into simpler structures while preserving the original meaning. Effective sentence simplification has significant applications across numerous domains, including education, content accessibility for individuals with cognitive disabilities, automated content creation, robotics, coding, and legal documents. Traditional approaches to sentence simplification have relied on rule-based systems, statistical methods, and, more recently, neural network architectures [1]. Complex sentences present significant challenges in action-oriented contexts, particularly when attempting to derive executable or actionable functionality, as in robotics, legal documents, and video games.
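As a rough illustration of the simplify-then-verify division of labor such a multi-agent setup implies (the paper's actual agents and prompts are not specified here), both agents below are stubbed with toy heuristics; a real system would back each role with its own LLM prompt:

```python
def simplifier_agent(sentence: str) -> list[str]:
    """Stub simplifier: splits on 'and' into candidate simple sentences.
    A real agent would be an LLM prompted to decompose the sentence."""
    parts = [p.strip() for p in sentence.replace(" and ", " | ").split("|")]
    return [p if p.endswith(".") else p + "." for p in parts]

def checker_agent(original: str, simplified: list[str]) -> bool:
    """Stub checker: accepts only if every word of the original (except the
    connective) survives, a crude proxy for semantic/logical integrity."""
    orig = set(original.lower().replace(".", "").split()) - {"and"}
    out = set(" ".join(simplified).lower().replace(".", "").split())
    return orig <= out

sent = "The player opens the door and the alarm sounds."
cands = simplifier_agent(sent)   # two simple sentences
ok = checker_agent(sent, cands)  # checker accepts the decomposition
```

In a full pipeline, a rejected decomposition would be fed back to the simplifier for another attempt, which is where the multi-agent loop earns its higher success rate over a single pass.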
- North America > United States > Virginia (0.76)
- North America > United States > California > Marin County > San Rafael (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
SimplifyMyText: An LLM-Based System for Inclusive Plain Language Text Simplification
Färber, Michael, Aghdam, Parisa, Im, Kyuri, Tawfelis, Mario, Ghoshal, Hardik
Text simplification is essential for making complex content accessible to diverse audiences who face comprehension challenges. Yet, the limited availability of simplified materials creates significant barriers to personal and professional growth and hinders social inclusion. Although researchers have explored various methods for automatic text simplification, none fully leverage large language models (LLMs) to offer tailored customization for different target groups and varying levels of simplicity. Moreover, despite its proven benefits for both consumers and organizations, the well-established practice of plain language remains underutilized. In this paper, we present https://simplifymytext.org, the first system designed to produce plain language content from multiple input formats, including typed text and file uploads, with flexible customization options for diverse audiences. We employ GPT-4 and Llama-3 and evaluate outputs across multiple metrics. Overall, our work contributes to research on automatic text simplification and highlights the importance of tailored communication in promoting inclusivity.
- North America > United States (0.15)
- Europe > United Kingdom > Scotland > City of Aberdeen > Aberdeen (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Aligning Sentence Simplification with ESL Learner's Proficiency for Language Acquisition
Li, Guanlin, Arase, Yuki, Crespi, Noel
Text simplification is crucial for improving accessibility and comprehension for English as a Second Language (ESL) learners. This study goes a step further and aims to facilitate ESL learners' language acquisition through simplification. Specifically, we propose simplifying complex sentences to appropriate levels for learners while also increasing vocabulary coverage of the target level in the simplifications. We achieve this without a parallel corpus by conducting reinforcement learning on a large language model. Our method employs token-level and sentence-level rewards, and iteratively trains the model on its self-generated outputs to guide the model to search for simplification hypotheses that satisfy the target attributes. Experimental results on the CEFR-SP and TurkCorpus datasets show that the proposed method can effectively increase the frequency and diversity of vocabulary of the target level by more than $20\%$ compared to baseline models, while maintaining high simplification quality.
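A toy sketch of how token-level and sentence-level rewards might be combined to rank self-generated hypotheses (our construction, not the paper's exact reward design; the word list and the overlap-based meaning proxy are placeholders):

```python
# Toy CEFR-style word list standing in for the target-level vocabulary.
TARGET_LEVEL_VOCAB = {"show", "big", "help", "use"}

def token_reward(hypothesis: str) -> float:
    """Fraction of hypothesis tokens drawn from the target-level vocabulary."""
    tokens = hypothesis.lower().split()
    return sum(t in TARGET_LEVEL_VOCAB for t in tokens) / max(len(tokens), 1)

def sentence_reward(source: str, hypothesis: str) -> float:
    """Crude meaning-preservation proxy: word overlap with the source."""
    src, hyp = set(source.lower().split()), set(hypothesis.lower().split())
    return len(src & hyp) / max(len(src), 1)

def total_reward(source: str, hypothesis: str, alpha: float = 0.6) -> float:
    """Weighted combination of the two reward signals."""
    return (alpha * token_reward(hypothesis)
            + (1 - alpha) * sentence_reward(source, hypothesis))

src = "demonstrate the large advantage"
hyps = ["show the big advantage", "demonstrate the large advantage"]
best = max(hyps, key=lambda h: total_reward(src, h))
```

With this weighting, the hypothesis that swaps in target-level vocabulary outranks the unedited copy, which is the pressure the iterative self-training loop would exploit.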
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Singapore (0.04)
- North America > Canada > Ontario > Toronto (0.04)
SiTSE: Sinhala Text Simplification Dataset and Evaluation
Ranathunga, Surangika, Sirithunga, Rumesh, Rathnayake, Himashi, De Silva, Lahiru, Aluthwala, Thamindu, Peramuna, Saman, Shekhar, Ravi
Text simplification is a task that has been minimally explored for low-resource languages. Consequently, only a few manually curated datasets exist. In this paper, we present a human-curated sentence-level text simplification dataset for the Sinhala language. Our evaluation dataset contains 1,000 complex sentences and 3,000 corresponding simplified sentences produced by three different human annotators. We model the text simplification task as a zero-shot, zero-resource sequence-to-sequence (seq2seq) task on the multilingual language models mT5 and mBART. We exploit auxiliary data from related seq2seq tasks and explore the possibility of using intermediate-task transfer learning (ITTL). Our analysis shows that ITTL outperforms previously proposed zero-resource methods for text simplification. Our findings also highlight the challenges in evaluating text simplification systems and support the calls for improved metrics for measuring the quality of automated text simplification systems that would also suit low-resource languages. Our code and data are publicly available: https://github.com/brainsharks-fyp17/Sinhala-Text-Simplification-Dataset-and-Evaluation
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Sri Lanka (0.05)
- Europe > United Kingdom > England > Essex (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Label Confidence Weighted Learning for Target-level Sentence Simplification
Multi-level sentence simplification generates simplified sentences at varying language proficiency levels. We propose Label Confidence Weighted Learning (LCWL), a novel approach that incorporates a label confidence weighting scheme into the training loss of an encoder-decoder model, setting it apart from existing confidence-weighting methods designed primarily for classification. Experiments on an English grade-level simplification dataset show that LCWL outperforms state-of-the-art unsupervised baselines. Fine-tuning the LCWL model on in-domain data and combining it with Symmetric Cross Entropy (SCE) consistently delivers better simplifications than strong supervised methods. Our results highlight the effectiveness of label confidence weighting techniques for text simplification tasks with encoder-decoder architectures.
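The core of a confidence-weighted training loss can be sketched in a few lines (our simplified reading; the paper's exact weighting scheme may differ): each training pair's cross-entropy term is scaled by its label confidence, so noisily labeled grade-level pairs contribute less to the gradient.

```python
import math

def weighted_nll(token_probs, confidence):
    """Negative log-likelihood of a target sequence, scaled by label
    confidence. `token_probs` are the model's probabilities assigned to
    the gold tokens (placeholder numbers below, not real model outputs)."""
    nll = -sum(math.log(p) for p in token_probs)
    return confidence * nll

# A high-confidence pair contributes its full loss; a low-confidence pair
# is down-weighted by the same factor.
loss_hi = weighted_nll([0.9, 0.8], confidence=1.0)
loss_lo = weighted_nll([0.9, 0.8], confidence=0.3)
```

The same idea plugs into a standard seq2seq training loop by multiplying each example's (or each token's) loss term by its confidence before averaging the batch.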
- North America > United States > Kentucky (0.05)
- Asia > Middle East > Iraq (0.05)
- North America > United States > Ohio (0.04)
- Research Report (1.00)
- Overview (0.66)
Edit-Constrained Decoding for Sentence Simplification
Zetsu, Tatsuya, Arase, Yuki, Kajiwara, Tomoyuki
We propose edit-operation-based lexically constrained decoding for sentence simplification. In sentence simplification, lexical paraphrasing is one of the primary procedures for rewriting complex sentences into simpler counterparts. While previous studies have confirmed the efficacy of lexically constrained decoding on this task, their constraints can be loose and may lead to sub-optimal generation. We address this problem by designing constraints that replicate the edit operations conducted in simplification and by defining stricter satisfaction conditions. Our experiments indicate that the proposed method consistently outperforms previous methods on three English simplification corpora commonly used for this task.
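One way to read "stricter satisfaction conditions" (our interpretation, not the authors' exact definition) is that a replace-style constraint requires both the appearance of the simple substitute and the removal of the complex word, whereas a loose include-style constraint checks only the former:

```python
def satisfies_replace(hypothesis: str, complex_word: str, simple_word: str) -> bool:
    """Strict replace-edit constraint: the substitute must appear AND the
    complex source word must be gone, mirroring an actual replace operation."""
    toks = hypothesis.lower().split()
    return simple_word in toks and complex_word not in toks

# A loose "include the substitute" constraint would accept `loose`
# because it contains "use", even though "utilize" survives.
loose = "we utilize and use the tool"
strict = "we use the tool"
```

During beam search, a hypothesis violating the strict condition would be pruned or penalized, steering generation toward outputs that actually perform the intended edit.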
Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese
Zhang, Jingshen, Chen, Xinglu, Qiu, Xinying, Wang, Zhimin, Feng, Wenhe
Chinese sentence simplification faces challenges due to the lack of large-scale labeled parallel corpora and the prevalence of idioms. To address these challenges, we propose Readability-guided Idiom-aware Sentence Simplification (RISS), a novel framework that combines data augmentation techniques with lexical simplification. RISS introduces two key components: (1) Readability-guided Paraphrase Selection (RPS), a method for mining high-quality sentence pairs, and (2) Idiom-aware Simplification (IAS), a model that enhances the comprehension and simplification of idiomatic expressions. By integrating RPS and IAS using multi-stage and multi-task learning strategies, RISS outperforms previous state-of-the-art methods on two Chinese sentence simplification datasets. Furthermore, RISS achieves additional improvements when fine-tuned on a small labeled dataset. Our approach demonstrates the potential for more effective and accessible Chinese text simplification.
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Europe > Netherlands (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment
Sentence simplification, which rewrites a sentence to be easier to read and understand, is a promising technique to help people with various reading difficulties. With the rise of advanced large language models (LLMs), evaluating their performance in sentence simplification has become imperative. Recent studies have used both automatic metrics and human evaluations to assess the simplification abilities of LLMs. However, the suitability of existing evaluation methodologies for LLMs remains in question. First, the suitability of current automatic metrics for evaluating LLM simplification is still uncertain. Second, current human evaluation approaches in sentence simplification often fall into two extremes: they are either too superficial, failing to offer a clear understanding of the models' performance, or overly detailed, making the annotation process complex and prone to inconsistency, which in turn affects the evaluation's reliability. To address these problems, this study provides in-depth insights into LLMs' performance while ensuring the reliability of the evaluation. We design an error-based human annotation framework to assess GPT-4's simplification capabilities. Results show that GPT-4 generally generates fewer erroneous simplification outputs than the current state-of-the-art. However, LLMs have their limitations, as seen in GPT-4's struggles with lexical paraphrasing. Furthermore, we conduct meta-evaluations on widely used automatic metrics using our human annotations. We find that while these metrics are effective at capturing significant quality differences, they lack the sensitivity to assess the overall high-quality simplifications produced by GPT-4.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)
- Government (0.68)
- Education > Educational Setting > Higher Education (0.46)
BoschAI @ PLABA 2023: Leveraging Edit Operations in End-to-End Neural Sentence Simplification
Knappich, Valentin, Razniewski, Simon, Friedrich, Annemarie
Automatic simplification can help laypeople comprehend complex scientific text. Language models are frequently applied to this task by translating from complex to simple language. In this paper, we describe our system based on Llama 2, which ranked first in the PLABA shared task addressing the simplification of biomedical text. We find that the large portion of tokens shared between input and output leads to weak training signals and models that edit only conservatively. To mitigate these issues, we propose sentence-level and token-level loss weights. These give higher weight to modified tokens, as indicated by edit distance and edit operations, respectively. We conduct an empirical evaluation on the PLABA dataset and find that both approaches lead to simplifications closer to those created by human annotators (+1.8% / +3.5% SARI), simpler language (-1 / -1.1 FKGL), and more edits (1.6x / 1.8x edit distance) compared to the same model fine-tuned with standard cross entropy. We furthermore show that the hyperparameter $\lambda$ in the token-level loss weights can be used to control the edit distance and the simplicity level (FKGL).
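The token-level weighting idea can be sketched as follows (a simplification of the paper's scheme: set membership is used here as a crude stand-in for a proper edit-operation alignment between input and output):

```python
def token_weights(src_tokens, tgt_tokens, lam=1.0):
    """Assign weight 1 + lam to output tokens not copied from the input and
    weight 1 to verbatim copies, so the loss concentrates on actual edits."""
    src = set(src_tokens)
    return [1.0 + lam if t not in src else 1.0 for t in tgt_tokens]

src = "myocardial infarction is severe".split()
tgt = "a heart attack is severe".split()
w = token_weights(src, tgt, lam=1.0)
```

The per-token cross-entropy terms would then be multiplied by these weights before summing; raising `lam` pushes the model toward more aggressive editing, matching the reported control over edit distance and FKGL.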
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)