CRANE: Knowledge Editing for Reasoning MLLMs

📅 2026-06-08

🤖 AI Summary

Existing knowledge editing methods for reasoning-oriented multimodal large language models (MLLMs) struggle to simultaneously preserve chain-of-thought (CoT) structure, keep the reasoning consistent between visual inputs and edited facts, and generalize beyond the exact edited query. This work identifies three primary failure modes in MLLM knowledge editing, introduces a CoT-aware evaluation protocol together with the ReasonEdit-Bench benchmark, and proposes CRANE, a retrieval-augmented framework. CRANE combines a modality-aware dual-library retrieval mechanism with supervised fine-tuning followed by GRPO reinforcement learning guided by a Cognitive Routing Reward, which trains the model to arbitrate between visual priors and edited facts without any per-edit parameter modification. Experiments show that CRANE achieves 96.9% Grounded Success on conflict scenarios in ReasonEdit-Bench, with text-locality and image-locality Edit Independence of 97.6% and 68.1%, respectively, and reaches 87.0% on the out-of-distribution MMEVOKE benchmark under gold (oracle) retrieval.

📝 Abstract

The emergence of reasoning multimodal large language models (MLLMs), which generate explicit chain-of-thought (CoT) reasoning before producing answers, has introduced a new challenge for knowledge editing: methods that appear successful under traditional metrics (teacher-forcing accuracy up to 100%) can fail severely when the model's reasoning process is examined (Grounded Success as low as 0%). We identify three failure modes: (1) Structural Collapse, where weight-modifying methods destroy the CoT format; (2) Cognitive Dissonance, where the model's reasoning chain actively rejects the injected edit fact based on visual evidence; and (3) Shallow Internalization, where methods succeed on exact queries but fail on rephrased or multi-hop variants. On reasoning MLLMs, these modes interact: methods that generalize (FT, LoRA) trigger format collapse, while methods without deep modification cannot generalize. To expose these failures, we propose a CoT-aware evaluation protocol and construct ReasonEdit-Bench, which features conflict stratification, multi-level probes, and multi-hop portability tests. We propose CRANE, a retrieval-augmented framework that requires no per-edit parameter modification. CRANE combines a modality-aware dual-library retrieval system with a two-phase training strategy: Supervised Fine-Tuning (SFT) for structural initialization, followed by GRPO with a Cognitive Routing Reward that trains the model to arbitrate between visual priors and injected edit facts. On ReasonEdit-Bench, CRANE achieves 96.9% Grounded Success on conflict scenarios and 96.9% intermediate entity usage in multi-hop chains, with 97.6% text-locality and 68.1% image-locality Edit Independence. On the out-of-distribution MMEVOKE benchmark, CRANE reaches 87.0% under gold retrieval.
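
The second training phase described above can be illustrated with a minimal Python sketch of the group-relative advantage that GRPO computes over a group of responses sampled for the same prompt. The abstract does not define the Cognitive Routing Reward, so `routing_reward`, its three boolean inputs, and its weights are hypothetical stand-ins chosen only for illustration; the use of the population standard deviation and the `eps` constant are likewise assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def routing_reward(has_cot_format: bool,
                   answer_correct: bool,
                   reasoning_uses_edit: bool) -> float:
    """Hypothetical reward; the weights are illustrative, not from the paper."""
    if not has_cot_format:
        # A response that loses the CoT format earns nothing in this sketch.
        return 0.0
    reward = 0.2  # credit for keeping the reasoning format intact
    if answer_correct:
        reward += 0.4  # final answer matches the expected one
    if reasoning_uses_edit:
        reward += 0.4  # reasoning chain relies on the injected edit fact
    return reward


def group_relative_advantages(rewards: list[float],
                              eps: float = 1e-6) -> list[float]:
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled responses to one prompt, scored by the hypothetical reward.
    group = [
        routing_reward(True, True, True),
        routing_reward(True, True, False),
        routing_reward(True, False, False),
        routing_reward(False, False, False),
    ]
    print(group_relative_advantages(group))
```

In this sketch, responses that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to rank them. When every response in a group gets the same reward, all advantages are zero and the group contributes no learning signal.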

Problem

Research questions and friction points this paper is trying to address.

knowledge editing

reasoning MLLMs

chain-of-thought

failure modes

grounded success
