FlowDC: Flow-Based Decoupling-Decay for Complex Image Editing

📅 2025-12-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In multi-target text-driven image editing, achieving semantic alignment and source-image consistency at the same time remains challenging. FlowDC addresses this with a decoupling-and-decay framework: first, it decomposes a complex edit into sub-edits whose effects are superposed in parallel; second, under the flow-matching paradigm, it splits the velocity into a component along the editing displacement and an orthogonal component, and decays the orthogonal component because it harms source structure. Contributions: (1) Complex-PIE-Bench, a benchmark designed for complex (multi-target) image editing; (2) superior results compared with existing methods on both Complex-PIE-Bench and PIE-Bench; (3) a better balance between semantic alignment and source consistency.

📝 Abstract

With the surge of pre-trained text-to-image flow matching models, text-based image editing has improved remarkably, especially for simple editing, which involves only a single editing target. To meet growing editing demands, complex editing, which involves multiple editing targets, has emerged as a more challenging task. However, the two current approaches to complex editing, single-round and multi-round editing, are limited by poor long-text following and cumulative inconsistency, respectively. As a result, they struggle to balance semantic alignment and source consistency. In this paper, we propose FlowDC, which decouples a complex edit into multiple sub-editing effects and superposes them in parallel during the editing process. We also observe that the velocity component orthogonal to the editing displacement harms source structure preservation. We therefore decompose the velocity and decay the orthogonal part for better source consistency. To evaluate performance in complex editing settings, we construct a complex editing benchmark, Complex-PIE-Bench. On two benchmarks, FlowDC shows superior results compared with existing methods. We also detail ablations of our module designs.
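The velocity decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `decay_orthogonal`, the scalar `decay` factor, and the reading of "editing displacement" as a single direction vector (for example, edited latent minus source latent) are all assumptions made for illustration, since the abstract does not define them.

```python
import numpy as np


def decay_orthogonal(velocity, displacement, decay=0.5, eps=1e-12):
    """Scale down the part of `velocity` orthogonal to `displacement`.

    Hypothetical helper: `decay` and the flattening of the latents into
    vectors are illustrative assumptions, not details from the paper.
    """
    v = np.asarray(velocity, dtype=float).ravel()
    d = np.asarray(displacement, dtype=float).ravel()

    denom = float(d @ d)
    if denom < eps:
        # No displacement direction is defined yet, so leave the velocity as is.
        return np.array(velocity, dtype=float)

    # Projection of v onto the displacement direction.
    parallel = (v @ d) / denom * d
    # Remainder, which is orthogonal to the displacement.
    orthogonal = v - parallel

    # Keep the parallel part, attenuate the orthogonal part.
    return (parallel + decay * orthogonal).reshape(np.shape(velocity))


if __name__ == "__main__":
    v = np.array([3.0, 4.0])
    d = np.array([1.0, 0.0])
    print(decay_orthogonal(v, d, decay=0.5))
```

With `decay=1.0` the velocity is returned unchanged, and with `decay=0.0` only the component along the displacement survives, so the factor controls how strongly the orthogonal part is suppressed.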

Problem

Research questions and friction points this paper is trying to address.

Complex image editing with multiple editing targets

Single-round editing struggles to follow long text; multi-round editing accumulates inconsistency

The velocity component orthogonal to the editing displacement harms source structure preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples complex editing into parallel sub-effects

Decomposes velocity and decays orthogonal component

Constructs benchmark for complex editing evaluation
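As a rough illustration of superposing sub-editing effects in parallel, the sketch below combines several per-sub-edit velocities into one editing velocity. It assumes each sub-edit's effect can be expressed as the difference between a velocity predicted under that sub-edit's prompt and a velocity predicted under the source prompt, and that effects combine by a weighted sum. The page does not state either detail, and `superpose_sub_edits` and `weights` are hypothetical names.

```python
import numpy as np


def superpose_sub_edits(v_source, v_sub_targets, weights=None):
    """Sum the velocity changes contributed by each sub-edit.

    Assumption (not from the paper): a sub-edit's effect is
    v_target_i - v_source, and effects are combined linearly.
    """
    v_source = np.asarray(v_source, dtype=float)
    if weights is None:
        weights = [1.0] * len(v_sub_targets)

    combined = np.zeros_like(v_source)
    for w, v_t in zip(weights, v_sub_targets):
        # Each term isolates what one sub-edit changes relative to the source.
        combined += w * (np.asarray(v_t, dtype=float) - v_source)
    return combined


if __name__ == "__main__":
    v_src = np.array([1.0, 1.0])
    v_targets = [np.array([2.0, 1.0]), np.array([1.0, 3.0])]
    print(superpose_sub_edits(v_src, v_targets))
```

Because every sub-edit is measured against the same source velocity in the same step, the sub-edits are applied in one pass instead of being chained across rounds.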
