Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification
Renliang Sun, Wei Xu, Xiaojun Wan
Randomly masking text spans in ordinary texts during pre-training hardly equips models with the ability to generate simple texts, which can hurt the performance of pre-trained models on text simplification tasks. In this paper, we propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts. We continue pre-training BART, a representative model, to obtain SimpleBART. SimpleBART consistently and significantly improves over BART on lexical simplification, sentence simplification, and document-level simplification. Finally, we compare SimpleBART with several representative large language models (LLMs).
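To make the idea concrete, here is a minimal, hypothetical sketch of continued pre-training toward simple-text generation with Hugging Face `transformers`: the model denoises masked spans, but the reconstruction targets come from a simple-text corpus rather than ordinary text. The checkpoint (`facebook/bart-base`), toy corpus, masking ratio, and hyperparameters are all illustrative assumptions, not the paper's actual recipe.

```python
import random

import torch
from transformers import BartForConditionalGeneration, BartTokenizer

# Base model to continue pre-training; "facebook/bart-base" is an
# illustrative choice, not necessarily the paper's checkpoint.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")


def mask_span(text: str, mask_ratio: float = 0.3) -> str:
    """Corrupt a sentence by replacing one random word span with <mask>."""
    words = text.split()
    span_len = max(1, int(len(words) * mask_ratio))
    start = random.randrange(len(words) - span_len + 1)
    return " ".join(words[:start] + [tokenizer.mask_token] + words[start + span_len:])


# Toy stand-in for a simple-text corpus (e.g., Simple English Wikipedia);
# the actual training data here is an assumption.
simple_texts = ["The cat sat on the mat because it was warm."]

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for text in simple_texts:
    inputs = tokenizer(mask_span(text), return_tensors="pt")
    labels = tokenizer(text, return_tensors="pt").input_ids
    # Denoising objective: reconstruct the original *simple* sentence,
    # so the model is rewarded for generating simple text.
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The key design point is that only the training corpus changes: the standard BART span-denoising objective is kept, but because every reconstruction target is a simple sentence, the decoder is nudged toward simple-text generation.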
arXiv.org Artificial Intelligence
May-21-2023
- Country:
  - Asia
    - China (0.04)
    - Southeast Asia (0.05)
  - North America > United States (0.14)
- Genre:
- Research Report (0.82)