Scalable Model Editing via Customized Expert Networks

Yao, Zihan, He, Yu, Qi, Tianyu, Li, Ming

Apr-3-2024–arXiv.org Artificial Intelligence

Addressing hallucinations and outdated knowledge in large language models is critical for their reliable application. Model editing presents a promising, cost-effective avenue for mitigating these problems. However, existing methods often generalize poorly and have unintended effects on unrelated samples. To overcome these limitations, we introduce Scalable Model Editing via Customized Expert Networks (SCEN), a two-stage continuous training paradigm. In the first stage, we train a lightweight expert network individually for each piece of knowledge that needs to be updated. In the second stage, we train a corresponding neuron for each expert to control that expert's activation state. In experiments on open-source large language models of two sizes, Llama2 7B and 13B, SCEN achieves state-of-the-art results compared with existing mainstream model editing methods. Our code is available at https://github.com/TAL-auroraX/SCEN
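The two-stage idea in the abstract (one lightweight expert per edited fact, plus one neuron per expert that decides whether the expert is active) can be illustrated with a small routing sketch. This is not the authors' implementation: the names `GatedExpert` and `route`, the sigmoid gate, the 0.5 threshold, and the fallback to the original layer are all assumptions made for illustration, and the experts and gate parameters are hard-coded rather than trained.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = Sequence[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class GatedExpert:
    """Hypothetical container: one edit's expert plus its gating neuron."""
    expert: Callable[[Vector], List[float]]  # stage 1: would be trained per edited fact
    gate_weights: List[float]                # stage 2: would be trained per expert
    gate_bias: float

    def activation(self, hidden: Vector) -> float:
        # A single neuron: weighted sum of the hidden state, then sigmoid.
        score = sum(w * h for w, h in zip(self.gate_weights, hidden))
        return sigmoid(score + self.gate_bias)


def route(hidden: Vector,
          original_layer: Callable[[Vector], List[float]],
          experts: List[GatedExpert],
          threshold: float = 0.5) -> List[float]:
    """Use the most strongly activated expert, or the original layer if none fires.

    The threshold and the "pick the maximum" rule are assumptions of this sketch.
    """
    best = max(experts, key=lambda e: e.activation(hidden), default=None)
    if best is not None and best.activation(hidden) > threshold:
        return best.expert(hidden)
    return original_layer(hidden)


# Toy usage with hand-picked parameters (nothing here is learned).
original = lambda h: [2.0 * x for x in h]
edit = GatedExpert(expert=lambda h: [x + 1.0 for x in h],
                   gate_weights=[4.0, 0.0],
                   gate_bias=-2.0)

in_scope = route([1.0, 0.0], original, [edit])      # gate fires, expert is used
out_of_scope = route([0.0, 1.0], original, [edit])  # gate stays off, original layer is used
```

Under these assumptions, an input that resembles the edited knowledge is handled by its expert, while unrelated inputs pass through the unmodified layer, which is the locality property the abstract is concerned with.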

customized expert network, scalable model editing

Genre:
- Research Report (0.69)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)