Reuse or Generate? Accelerating Code Editing via Edit-Oriented Speculative Decoding

📅 2025-06-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the slow decoding speed, neglect of locality, and underutilization of code reuse in large language models (LLMs) for code editing tasks, this paper proposes the first editing-oriented speculative decoding framework. Our method integrates an edit-aware lightweight draft model, a dynamic verifier, and a locality-aware edit localization strategy, explicitly modeling the localized nature of code changes and intelligently reusing original code segments to generate high-quality edit drafts. Unlike conventional speculative decoding, our framework is co-designed at both the architectural and mechanistic levels specifically for code editing. Experiments on CanItEdit and CodeIF-Bench demonstrate 10.38× and 13.09× decoding speedups, respectively, outperforming state-of-the-art acceleration methods by up to 90.6% while preserving edit accuracy.

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in code editing, substantially enhancing software development productivity. However, the inherent complexity of code editing tasks forces existing approaches to rely on LLMs' autoregressive end-to-end generation, where decoding speed plays a critical role in efficiency. While inference acceleration techniques like speculative decoding have been applied to improve decoding efficiency, these methods fail to account for the unique characteristics of code editing tasks, where changes are typically localized and existing code segments are reused. To address this limitation, we propose EfficientEdit, a novel method that improves LLM-based code editing efficiency through two key mechanisms based on speculative decoding: (1) effective reuse of original code segments while identifying potential edit locations, and (2) efficient generation of edit content via high-quality drafts from edit-oriented draft models and a dynamic verification mechanism that balances quality and acceleration. Experimental results show that EfficientEdit achieves up to 10.38× and 13.09× speedup over standard autoregressive decoding on CanItEdit and CodeIF-Bench, respectively, outperforming state-of-the-art inference acceleration approaches by up to 90.6%.
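The abstract's two mechanisms (reusing original code as cheap draft tokens, and verifying drafts against the target model) can be illustrated with a minimal token-level sketch. This is not the paper's EfficientEdit implementation: it assumes greedy verification, a toy `target_step` callable standing in for the target LLM, and a crude one-token resynchronization at mismatches, where the real system uses an edit-oriented draft model and edit localization.

```python
def speculative_edit(original, target_step, k=4):
    """Sketch of speculative decoding for code editing: speculate that the
    original tokens are mostly reused, and verify each k-token draft.

    original    -- list of tokens of the pre-edit code
    target_step -- callable(prefix) -> next token under the target model,
                   or None when generation is finished (toy stand-in)
    k           -- draft length per speculation round
    """
    out = []
    pos = 0  # cursor into the original code used for drafting
    while True:
        draft = original[pos:pos + k]      # reuse original code as the draft
        if not draft:                      # original exhausted: plain decoding
            tok = target_step(out)
            if tok is None:
                return out
            out.append(tok)
            continue
        for tok in draft:
            expected = target_step(out)
            if expected is None:           # target model finished
                return out
            if tok == expected:
                out.append(tok)            # draft accepted: cheap step
                pos += 1
            else:
                out.append(expected)       # edit location: take target token
                pos += 1                   # crude resync; real systems realign
                break
```

In the no-edit regions every round accepts up to `k` tokens per target-model check, which is where the speedup comes from; at edit locations the loop falls back to the target model's own token.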
Problem

Research questions and friction points this paper is trying to address.

Accelerating code editing via speculative decoding
Reusing existing code segments for efficiency
Balancing quality and speed in edit generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reuses original code segments effectively
Generates edits via draft models
Balances quality with dynamic verification
👥 Authors
Peiding Wang, Beihang University
Li Zhang, Beihang University
Fang Liu, Beihang University
Yinghao Zhu, The University of Hong Kong (Data Mining, AI for Healthcare)
Wang Xu, Harbin Institute of Technology (Natural Language Processing, Artificial Intelligence)
Lin Shi, Beihang University (Software Engineering)
Xiaoli Lian, Beihang University
Minxiao Li, Beihang University
Bo Shen, Huawei Cloud Computing Technologies Co., Ltd.
An Fu, Huawei Cloud Computing Technologies Co., Ltd.