CreatiPoster: Towards Editable and Controllable Multi-Layer Graphic Design Generation

📅 2025-06-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high skill barrier, poor editability, and low-quality multi-layer output of current graphic design tools, this paper introduces CreatiPoster, an end-to-end framework for editable multi-layer poster generation. It has three parts: (1) an RGBA-aware large multimodal model that outputs a structured JSON design specification covering layout, layer hierarchy, content, and styling for every text and asset layer, plus a background prompt; (2) a foreground-background decoupled generation paradigm, in which a conditional background model synthesizes a background that is visually consistent with the rendered foreground; (3) a copyright-free corpus of 100,000 multi-layer designs, released alongside a benchmark with automated metrics. Experiments show that the approach surpasses leading open-source methods and commercial systems (e.g., Canva Magic Design), and that it supports canvas editing, multilingual adaptation, responsive resizing, and animated posters.

📝 Abstract
Graphic design plays a crucial role in both commercial and personal contexts, yet creating high-quality, editable, and aesthetically pleasing graphic compositions remains a time-consuming and skill-intensive task, especially for beginners. Current AI tools automate parts of the workflow, but struggle to accurately incorporate user-supplied assets, maintain editability, and achieve professional visual appeal. Commercial systems, like Canva Magic Design, rely on vast template libraries, which are impractical to replicate. In this paper, we introduce CreatiPoster, a framework that generates editable, multi-layer compositions from optional natural-language instructions or assets. A protocol model, an RGBA large multimodal model, first produces a JSON specification detailing every layer (text or asset) with precise layout, hierarchy, content and style, plus a concise background prompt. A conditional background model then synthesizes a coherent background conditioned on the rendered foreground layers. We construct a benchmark with automated metrics for graphic-design generation and show that CreatiPoster surpasses leading open-source approaches and proprietary commercial systems. To catalyze further research, we release a copyright-free corpus of 100,000 multi-layer designs. CreatiPoster supports diverse applications such as canvas editing, text overlay, responsive resizing, multilingual adaptation, and animated posters, advancing the democratization of AI-assisted graphic design. Project homepage: https://github.com/graphic-design-ai/creatiposter
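To make the idea of a per-layer JSON specification concrete, here is a minimal sketch of what such a document might look like and how a renderer could load it. The abstract does not publish the schema, so every field name below (`canvas`, `background_prompt`, `layers`, `z`, `box`, `style`) and the helper `load_spec` are illustrative assumptions, not CreatiPoster's actual format.

```python
import json

# Hypothetical specification: all field names are assumptions made for
# illustration and are not taken from the CreatiPoster paper or code.
SPEC_JSON = """
{
  "canvas": {"width": 1080, "height": 1350},
  "background_prompt": "soft pastel gradient with subtle paper texture",
  "layers": [
    {"type": "text", "z": 2, "box": [90, 120, 900, 200],
     "content": "Summer Sale",
     "style": {"font_size": 96, "color": "#FFFFFF"}},
    {"type": "asset", "z": 1, "box": [240, 420, 600, 600],
     "content": "product.png",
     "style": {"opacity": 1.0}}
  ]
}
"""


def load_spec(text):
    """Parse a spec, check each layer, and order layers bottom-to-top."""
    spec = json.loads(text)
    width = spec["canvas"]["width"]
    height = spec["canvas"]["height"]
    for layer in spec["layers"]:
        if layer["type"] not in ("text", "asset"):
            raise ValueError("unknown layer type: " + str(layer["type"]))
        # box is assumed to be [x, y, width, height] in canvas pixels.
        x, y, box_w, box_h = layer["box"]
        if x < 0 or y < 0 or x + box_w > width or y + box_h > height:
            raise ValueError("layer box falls outside the canvas")
    # Hierarchy: a lower z value is drawn first, so it sits underneath.
    spec["layers"].sort(key=lambda layer: layer["z"])
    return spec


spec = load_spec(SPEC_JSON)
print([layer["type"] for layer in spec["layers"]])  # ['asset', 'text']
```

Because each layer stays a separate, addressable record, an editor can change one layer's text, box, or style without regenerating the rest of the design, which is the sense of editability the abstract emphasizes.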
Problem

Research questions and friction points this paper is trying to address.

Automating high-quality editable multi-layer graphic design generation
Enhancing user control over design assets and editability
Overcoming limitations of template-based commercial design systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates editable multi-layer designs from instructions
Uses RGBA model for precise layer specifications
Conditional background model ensures visual coherence
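The role of the alpha channel in a foreground-background split can be sketched with standard alpha compositing. The summary does not say how CreatiPoster merges the rendered foreground with the synthesized background, so the Porter-Duff "over" operator below is an assumption chosen for illustration; the function names `over` and `composite` are hypothetical.

```python
def over(top, bottom):
    """Porter-Duff 'over' for two straight-alpha RGBA pixels in [0, 1]."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        # Weight each color by its effective coverage, then un-premultiply.
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def composite(foreground, background):
    """Composite two equally sized images given as rows of RGBA tuples."""
    return [
        [over(f, b) for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(foreground, background)
    ]


# A half-transparent red pixel over an opaque blue background pixel.
print(over((1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 1.0, 1.0)))  # (0.5, 0.0, 0.5, 1.0)
```

Wherever the foreground alpha is zero, the background shows through unchanged, so the background model only has to stay coherent with the regions the foreground layers actually cover.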
Authors
Zhao Zhang
ByteDance, Intelligent Creation
Yutao Cheng
ByteDance, Intelligent Creation
Dexiang Hong
ByteDance Inc.
Computer Vision, Deep Learning, Diffusion Model
Maoke Yang
ByteDance, Intelligent Creation
Gonglei Shi
ByteDance, Intelligent Creation
Lei Ma
ByteDance, Intelligent Creation
Hui Zhang
ByteDance, Fudan University
Jie Shao
Professor, University of Electronic Science and Technology of China
Multimedia, Database
Xinglong Wu
Algorithm Engineer, ByteDance
Artificial Intelligence