SketchAssist: A Practical Assistant for Semantic Edits and Precise Local Redrawing

📅 2025-12-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing image editing methods struggle to simultaneously preserve the sparse structural integrity of line art, enable high-level semantic modifications, and support precise local redrawing—hindering efficiency in digital illustration sketch editing. This paper proposes the first unified editing framework tailored for line art creation. We design an attribute-addition-and-removal chained data generation pipeline and introduce a task-oriented vision–text dual-path Mixture-of-Experts LoRA (MoE-LoRA) architecture, enabling synergistic optimization between semantic-instruction-driven global editing and stroke-guided local redrawing. Key innovations include RGB-channel reuse encoding, a style-aware attribute removal module, and a cross-sequence multi-step editing chain. Our method achieves state-of-the-art performance on both semantic editing and local redrawing tasks, significantly outperforming baselines in instruction adherence, structural fidelity, and style consistency.

📝 Abstract
Sketch editing is central to digital illustration, yet existing image editing systems struggle to preserve the sparse, style-sensitive structure of line art while supporting both high-level semantic changes and precise local redrawing. We present SketchAssist, an interactive sketch drawing assistant that accelerates creation by unifying instruction-guided global edits with line-guided region redrawing, while keeping unrelated regions and overall composition intact. To enable this assistant at scale, we introduce a controllable data generation pipeline that (i) constructs attribute-addition sequences from attribute-free base sketches, (ii) forms multi-step edit chains via cross-sequence sampling, and (iii) expands stylistic coverage with a style-preserving attribute-removal model applied to diverse sketches. Building on this data, SketchAssist employs a unified sketch editing framework with minimal changes to DiT-based editors. We repurpose the RGB channels to encode the inputs, enabling seamless switching between instruction-guided edits and line-guided redrawing within a single input interface. To further specialize behavior across modes, we integrate a task-guided mixture-of-experts into LoRA layers, routing by text and visual cues to improve semantic controllability, structural fidelity, and style preservation. Extensive experiments show state-of-the-art results on both tasks, with superior instruction adherence and style/structure preservation compared to recent baselines. Together, our dataset and SketchAssist provide a practical, controllable assistant for sketch creation and revision.
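The "RGB-channel reuse" idea in the abstract can be illustrated with a minimal sketch: pack the grayscale input sketch and the optional stroke/mask guidance into one 3-channel image, so a single conditioning interface serves both editing modes. The channel assignment below (R = sketch, G = user strokes, B = redraw mask) is an assumption for illustration, not the paper's actual encoding.

```python
import numpy as np

def pack_rgb_condition(sketch, strokes=None, mask=None):
    """Hypothetical RGB-channel-reuse encoder.

    sketch, strokes, mask: (H, W) float arrays in [0, 1].
    Returns an (H, W, 3) conditioning image. Instruction-guided edits
    pass only `sketch`; line-guided redrawing also passes `strokes`
    and the region `mask`, all through the same input interface.
    """
    h, w = sketch.shape
    r = sketch                                                              # R: original sketch
    g = strokes if strokes is not None else np.zeros((h, w), sketch.dtype)  # G: guiding strokes
    b = mask if mask is not None else np.zeros((h, w), sketch.dtype)        # B: region to redraw
    return np.stack([r, g, b], axis=-1)
```

With this packing, the editor never needs a second input branch: the mode is implicit in which channels are non-empty.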
Problem

Research questions and friction points this paper is trying to address.

Enabling high-level semantic edits and precise local redrawing in sketches.
Preserving sparse, style-sensitive structure and overall composition during edits.
Unifying instruction-guided global edits with line-guided region redrawing interactively.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies instruction-guided global edits with line-guided region redrawing
Introduces a controllable data generation pipeline for multi-step edit chains
Integrates task-guided mixture-of-experts into LoRA layers for improved controllability
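The task-guided MoE-LoRA idea can be sketched as a frozen linear layer augmented with several LoRA experts, mixed by a router conditioned on a pooled task embedding (standing in for the paper's text/visual cues). This is a minimal illustration under assumed dimensions and soft gating; the paper's actual routing and expert layout are not specified here.

```python
import torch
import torch.nn as nn

class MoELoRALinear(nn.Module):
    """Hypothetical task-guided MoE-LoRA layer: a frozen base linear
    plus `num_experts` low-rank adapters, softly gated by a router
    over a task embedding (e.g. pooled text/visual cues)."""

    def __init__(self, base: nn.Linear, num_experts=4, rank=8, task_dim=64):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen; only adapters train
        d_in, d_out = base.in_features, base.out_features
        self.down = nn.Parameter(torch.randn(num_experts, d_in, rank) * 0.01)
        self.up = nn.Parameter(torch.zeros(num_experts, rank, d_out))  # zero-init: starts as identity edit
        self.router = nn.Linear(task_dim, num_experts)

    def forward(self, x, task_emb):
        # x: (batch, seq, d_in); task_emb: (batch, task_dim)
        gates = torch.softmax(self.router(task_emb), dim=-1)            # (batch, num_experts)
        # per-expert low-rank update: x @ down_e @ up_e, then gate-weighted sum
        delta = torch.einsum('bsi,eir,ero->bseo', x, self.down, self.up)
        delta = torch.einsum('bseo,be->bso', delta, gates)
        return self.base(x) + delta
```

Because the up-projections are zero-initialized, the layer reproduces the frozen base model exactly at the start of training, and the router learns to specialize experts between the semantic-editing and local-redrawing modes.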
👥 Authors
Han Zou, Meta (Multimodal AI)
Yan Zhang, Global Business Unit, Baidu Inc.
Ruiqi Yu, Global Business Unit, Baidu Inc.
Cong Xie, ByteDance Inc. and University of Illinois at Urbana-Champaign (Distributed Machine Learning)
Jie Huang, Global Business Unit, Baidu Inc.
Zhenpeng Zhan, Global Business Unit, Baidu Inc.