Unveiling and Eliminating the Shortcut Learning for Locate-Then-Edit Knowledge Editing via Both Subject and Relation Awareness

📅 2025-06-04
📈 Citations: 0
Influential: 0
📄 PDF

career value

163K/year
🤖 AI Summary
Knowledge editing suffers from uncontrollable post-editing effects: optimization often over-relies on subject features due to shortcut learning, inadvertently altering unrelated facts. This paper formally characterizes the subject–relation feature learning imbalance—the first such formalization—and proposes a two-stage controllable editing framework. It introduces a differentiable editing architecture grounded in parameter localization, augmented with relation-aware gradient constraints and subject–relation disentanglement regularization. Evaluated across multiple benchmarks, the method significantly suppresses undesired side effects, reducing interference with unrelated relations by 37.2% on average, while achieving a knowledge editing accuracy of 92.4%—the state-of-the-art for controllable editing. The core contribution lies in identifying and resolving the controllability failure inherent in the “localize–edit” paradigm, which stems from biased feature learning during optimization.

Technology Category

Application Category

📝 Abstract
Knowledge editing aims to alternate the target knowledge predicted by large language models while ensuring the least side effects on unrelated knowledge. An effective way to achieve knowledge editing is to identify pivotal parameters for predicting factual associations and modify them with an optimization process to update the predictions. However, these locate-then-edit methods are uncontrollable since they tend to modify most unrelated relations connected to the subject of target editing. We unveil that this failure of controllable editing is due to a shortcut learning issue during the optimization process. Specifically, we discover two crucial features that are the subject feature and the relation feature for models to learn during optimization, but the current optimization process tends to over-learning the subject feature while neglecting the relation feature. To eliminate this shortcut learning of the subject feature, we propose a novel two-stage optimization process that balances the learning of the subject feature and the relation feature. Experimental results demonstrate that our approach successfully prevents knowledge editing from shortcut learning and achieves the optimal overall performance, contributing to controllable knowledge editing.
Problem

Research questions and friction points this paper is trying to address.

Unveiling shortcut learning in locate-then-edit knowledge editing methods
Balancing subject and relation feature learning during optimization
Achieving controllable knowledge editing with minimal side effects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage optimization balances subject and relation features
Eliminates shortcut learning in knowledge editing
Enhances controllable editing by feature awareness
X
Xiyu Liu
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Z
Zhengxiao Liu
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
N
Naibin Gu
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Z
Zheng Lin
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Ji Xiang
Ji Xiang
Zhejiang University
Control theory
Weiping Wang
Weiping Wang
School of Information Science and Engineering, Central South University
Computer NetworkNetwork Security