Rethinking the Residual Distribution of Locate-then-Editing Methods in Model Editing

📅 2025-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the degradation of original knowledge that locate-then-edit methods cause in large language models (LLMs), which the authors attribute to distributing the residual evenly across the critical layers during knowledge updates. Empirical and theoretical analysis shows that residual distribution introduces editing errors and leads to inaccurate edits. To address this, the authors propose the Boundary Layer UpdatE (BLUE) strategy, which enhances locate-then-edit methods. In sequential batch editing experiments on three LLMs and two datasets, BLUE delivers an average performance improvement of 35.59% and better preserves the general capabilities of the edited models.

📝 Abstract
Model editing is a powerful technique for updating the knowledge of Large Language Models (LLMs). Locate-then-edit methods are a popular class of approaches that first identify the critical layers storing knowledge, then compute the residual of the last critical layer based on the edited knowledge, and finally perform multi-layer updates using a least-squares solution by evenly distributing the residual from the first critical layer to the last. Although these methods achieve promising results, they have been shown to degrade the original knowledge of LLMs. We argue that residual distribution leads to this issue. To explore this, we conduct a comprehensive analysis of residual distribution in locate-then-edit methods from both empirical and theoretical perspectives, revealing that residual distribution introduces editing errors, leading to inaccurate edits. To address this issue, we propose the Boundary Layer UpdatE (BLUE) strategy to enhance locate-then-edit methods. Sequential batch editing experiments on three LLMs and two datasets demonstrate that BLUE not only delivers an average performance improvement of 35.59%, significantly advancing the state of the art in model editing, but also enhances the preservation of LLMs' general capabilities. Our code is available at https://github.com/xpq-tech/BLUE.
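The multi-layer update described in the abstract can be sketched as follows. This is a minimal illustration of the even residual distribution used by existing locate-then-edit methods, not the BLUE strategy and not the authors' code. The function names, the array shapes, and the covariance term `C` are assumptions made for the example. The sketch also divides a fixed residual once, whereas such methods typically recompute the residual after each layer's update.

```python
import numpy as np

def least_squares_update(K, R, C):
    """Closed-form least-squares weight update: Delta = R K^T (C + K K^T)^{-1}.

    K: (d_in, n) keys of the edited facts (assumed shape)
    R: (d_out, n) residual share assigned to this layer
    C: (d_in, d_in) covariance of keys to preserve (assumed regularizer)
    """
    A = C + K @ K.T
    # Solve Delta @ A = R @ K.T without forming an explicit inverse.
    return np.linalg.solve(A.T, K @ R.T).T

def even_residual_shares(delta, critical_layers):
    """Split the last-layer residual evenly from the first critical layer to the last.

    The layer at index l receives delta / (last - l + 1), so earlier layers
    take a smaller share and the last layer takes the full remaining residual.
    """
    last = critical_layers[-1]
    return {l: delta / (last - l + 1) for l in critical_layers}
```

With critical layers 4, 5, and 6, for example, layer 4 receives one third of the residual, layer 5 one half, and layer 6 all of it.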
Problem

Research questions and friction points this paper is trying to address.

Why locate-then-edit methods degrade the original knowledge of LLMs
Whether evenly distributing the residual across critical layers introduces editing errors
How to make the multi-layer updates of locate-then-edit methods more accurate
Innovation

Methods, ideas, or system contributions that make the work stand out.

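Proposes the Boundary Layer UpdatE (BLUE) strategy to enhance locate-then-edit methods
Analyzes residual distribution empirically and theoretically, showing that it introduces editing errors
Improves average performance by 35.59% in sequential batch editing on three LLMs and two datasets, while better preserving the LLMs' general capabilities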
Xiaopeng Li
National University of Defense Technology
Shanwen Wang
National University of Defense Technology
Shasha Li
National University of Defense Technology
Shezheng Song
National University of Defense Technology
Bin Ji
National University of Defense Technology
Jun Ma
National University of Defense Technology
Jie Yu
National University of Defense Technology