O-Edit: Orthogonal Subspace Editing for Language Model Sequential Editing
arXiv.org Artificial Intelligence
Large language models (LLMs) acquire knowledge during pre-training, but over time this knowledge may become incorrect or outdated, necessitating updates after training. Knowledge editing techniques address this issue without costly retraining. However, most existing methods are designed for single edits, and as the number of edits grows they often degrade the model's overall performance, posing significant challenges for sequential editing. To overcome this, we propose Orthogonal Subspace Editing (O-Edit). Our approach requires no replay of previously edited data and incorporates each piece of edited knowledge as it arrives. It can perform thousands of edits on mainstream LLMs, achieving an average performance improvement 4.2 times that of existing methods while effectively preserving the model's performance on downstream tasks, all with minimal additional parameter overhead.

LLMs are trained on vast amounts of textual data, enabling them to store extensive knowledge about many aspects of the human world and sparking the potential for general artificial intelligence. Given the substantial computational cost of retraining LLMs to correct outdated or incorrect knowledge, there has been growing interest in model editing techniques (Yao et al., 2023; Wang et al., 2023a), which aim to update specific content within the model while minimizing computational cost. In this paper, we focus on parameter-modifying editing methods.
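The core intuition behind editing in orthogonal subspaces can be illustrated with a minimal sketch: each new weight-update direction is projected onto the subspace orthogonal to all earlier edit directions (a Gram–Schmidt step), so later edits do not interfere with earlier ones along those directions. This is an illustrative toy on plain vectors, not the paper's actual O-Edit implementation; the function names and dimensions are assumptions for demonstration.

```python
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(delta, prev_edits):
    """Project a candidate update `delta` onto the subspace orthogonal to
    all previous edit directions (one Gram-Schmidt pass).
    Illustrative only -- not the O-Edit algorithm itself."""
    for u in prev_edits:
        coeff = dot(delta, u) / dot(u, u)
        delta = [d - coeff * ui for d, ui in zip(delta, u)]
    return delta

random.seed(0)
prev_edits = []
for _ in range(3):  # simulate three sequential edits in an 8-dim toy space
    candidate = [random.gauss(0.0, 1.0) for _ in range(8)]
    prev_edits.append(orthogonalize(candidate, prev_edits))

# Every pair of stored edit directions is (numerically) orthogonal,
# so no edit overwrites the component written by an earlier one.
for i in range(len(prev_edits)):
    for j in range(i):
        assert abs(dot(prev_edits[i], prev_edits[j])) < 1e-9
```

In practice the paper applies this kind of constraint to weight updates in LLM layers rather than to raw vectors, but the orthogonality condition shown here is the reason sequential edits can accumulate without degrading one another.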
Oct-15-2024