G-SPEED: General SParse Efficient Editing MoDel

Zhang, Haoke, Wang, Yue, Li, Juntao, Zhou, Xiabing, Zhang, Min

Oct-16-2023–arXiv.org Artificial Intelligence

Large Language Models~(LLMs) have demonstrated incredible capabilities in understanding, generating, and manipulating languages. Through human-model interactions, LLMs can automatically understand human-issued instructions and output the expected contents, which can significantly increase working efficiency. In various types of real-world demands, editing-oriented tasks account for a considerable proportion, which involves an interactive process that entails the continuous refinement of existing texts to meet specific criteria. Due to the need for multi-round human-model interaction and the generation of complicated editing tasks, there is an emergent need for efficient general editing models. In this paper, we propose \underline{\textbf{G}}eneral \underline{\textbf{SP}}arse \underline{\textbf{E}}fficient \underline{\textbf{E}}diting Mo\underline{\textbf{D}}el~(\textbf{G-SPEED}), which can fulfill diverse editing requirements through a single model while maintaining low computational costs. Specifically, we first propose a novel unsupervised text editing data clustering algorithm to deal with the data scarcity problem. Subsequently, we introduce a sparse editing model architecture to mitigate the inherently limited learning capabilities of small language models. The experimental outcomes indicate that G-SPEED, with its 508M parameters, can surpass LLMs equipped with 175B parameters. Our code and model checkpoints are available at \url{https://github.com/Banner-Z/G-SPEED}.

dataset, g-speed, proceedings, (14 more...)

arXiv.org Artificial Intelligence

Oct-16-2023

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.04)
- Europe
  - Spain (0.04)
  - Portugal (0.04)
  - Netherlands (0.04)
  - Germany (0.04)
  - France (0.04)
  - Belgium (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - China > Jiangsu Province (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:
- Research Report > New Finding (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Statistical Learning
    - Clustering (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found