🤖 AI Summary
Existing knowledge editing methods for large language models (LLMs) fail on multi-hop reasoning tasks because they modify only shallow layers, leaving unchanged the deep implicit subject representations that multi-hop inference depends on, so newly edited facts are not recalled. Method: The paper first identifies a hierarchical distinction between single-hop and multi-hop knowledge retrieval, showing that multi-hop reasoning relies heavily on implicit subject representations encoded in deeper MLP layers. Building on this insight, it proposes IFMET, a locate-then-edit framework that uses mechanistic interpretability analysis to localize knowledge across layers, edits both shallow and deep MLP parameters, and introduces multi-hop-specific editing prompts. Contribution/Results: Experiments demonstrate that IFMET significantly outperforms state-of-the-art locate-then-edit approaches on multi-hop factual recall benchmarks, mitigating edit-induced recall failures across reasoning hops.
📝 Abstract
The locate-then-edit paradigm has shown significant promise for knowledge editing (KE) in Large Language Models (LLMs). While previous methods perform well on single-hop factual recall tasks, they consistently struggle with multi-hop factual recall tasks involving newly edited knowledge. In this paper, leveraging tools from mechanistic interpretability, we first identify that in multi-hop tasks, LLMs tend to retrieve knowledge with implicit subject information from deeper MLP layers, unlike single-hop tasks, which rely on shallow layers. This distinction explains the poor performance of current methods on multi-hop queries, as they primarily focus on editing shallow layers with single-hop edit prompts, leaving deeper layers unchanged. To address this, we propose IFMET, a novel locate-then-edit KE approach designed to edit both shallow and deep MLP layers. Beyond single-hop editing prompts, IFMET further incorporates multi-hop editing prompts to locate and modify knowledge across different stages of reasoning. Experimental results demonstrate that IFMET significantly improves performance on multi-hop factual recall tasks, overcoming the limitations of previous locate-then-edit methods.
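To make the edit-both-shallow-and-deep idea concrete, here is a minimal sketch of a locate-then-edit style weight update applied at two layers. It uses a simplified ROME-style rank-one closed-form edit (without the covariance preconditioning real methods use); the layer weights, key vectors, and target value below are all hypothetical stand-ins, not IFMET's actual procedure. The key at the shallow layer would be located with a single-hop prompt, and the key at the deep layer with a multi-hop prompt.

```python
import numpy as np

def rank_one_edit(W, k, v):
    """Rank-one update so the edited weight maps key k to value v.

    Simplified ROME-style closed-form edit:
        W' = W + (v - W k) k^T / (k^T k)
    After the edit, W' @ k == v exactly, while inputs orthogonal
    to k are mapped as before.
    """
    residual = v - W @ k
    return W + np.outer(residual, k) / (k @ k)

rng = np.random.default_rng(0)
d = 8
# Hypothetical MLP weights at a shallow and a deep layer.
W_shallow = rng.normal(size=(d, d))
W_deep = rng.normal(size=(d, d))

# Hypothetical keys: the subject representation elicited by a
# single-hop prompt (shallow layer) and by a multi-hop prompt
# (deep layer), both rewritten toward the same new fact v_new.
k_single = rng.normal(size=d)
k_multi = rng.normal(size=d)
v_new = rng.normal(size=d)

W_shallow = rank_one_edit(W_shallow, k_single, v_new)
W_deep = rank_one_edit(W_deep, k_multi, v_new)

print(np.allclose(W_shallow @ k_single, v_new))  # True
print(np.allclose(W_deep @ k_multi, v_new))      # True
```

Editing only `W_shallow` would leave `W_deep @ k_multi` at its old value, which mirrors the paper's diagnosis of why shallow-only edits fail on multi-hop queries.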