🤖 AI Summary
Existing LLM-driven UI design tools suffer from two key bottlenecks: difficulty in externalizing design intent and insufficient iterative control. This paper introduces SPEC, a structured, hierarchical UI intermediate representation that explicitly extracts and parameterizes design intent via vision-language models (VLMs) and region segmentation, enabling cross-source composition and controllable editing at three levels (global, regional, and component). Integrated with a multi-agent generation architecture, SPEC supports end-to-end parsing from reference UIs to SPEC and subsequent high-fidelity interface generation. Crucially, SPEC is the first representation to encode design intent as an editable specification, transcending prompt-engineering limitations and establishing a human-AI closed-loop collaboration paradigm. Quantitative experiments demonstrate significant improvements in intent reconstruction accuracy over prompt-based baselines. A user study with 16 professional designers confirms SPEC's superiority over Stitch across intent alignment, design quality, controllability, and overall experience.
📝 Abstract
Large language models (LLMs) promise to accelerate UI design, yet current tools struggle with two fundamental challenges: externalizing designers' intent and controlling iterative change. We introduce SPEC, a structured, parameterized, hierarchical intermediate representation that exposes UI elements as controllable parameters. Building on SPEC, we present SpecifyUI, an interactive system that extracts SPEC from UI references via region segmentation and vision-language models, composes UIs across multiple sources, and supports targeted edits at the global, regional, and component levels. A multi-agent generator renders SPEC into high-fidelity designs, closing the loop between intent expression and controllable generation. Quantitative experiments show SPEC-based generation more faithfully captures reference intent than prompt-based baselines. In a user study with 16 professional designers, SpecifyUI significantly outperformed Stitch on intent alignment, design quality, controllability, and overall experience in human-AI co-creation. Our results position SPEC as a specification-driven paradigm that shifts LLM-assisted design from one-shot prompting to iterative, collaborative workflows.
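To make the idea of a hierarchical, parameterized representation with three-level edits concrete, here is a minimal illustrative sketch. The schema, names, and edit function below are assumptions for illustration only; the paper's actual SPEC format is not shown in the abstract.

```python
from dataclasses import dataclass

# Hypothetical mini-schema: a spec with global parameters, regions, and
# components whose design attributes are exposed as editable parameters.

@dataclass
class Component:
    name: str      # e.g. "cta_button" (illustrative name)
    params: dict   # design parameters exposed for targeted editing

@dataclass
class Region:
    name: str             # e.g. "header"
    components: list      # regional level groups components

@dataclass
class Spec:
    global_params: dict   # global level: theme-wide parameters
    regions: list         # regional level: layout sections

def edit_component(spec: Spec, region: str, component: str, **updates) -> None:
    """Component-level edit: update one component's parameters in place,
    leaving global and regional parameters untouched."""
    for r in spec.regions:
        if r.name == region:
            for c in r.components:
                if c.name == component:
                    c.params.update(updates)

spec = Spec(
    global_params={"primary_color": "#3366FF", "font": "Inter"},
    regions=[Region("header",
                    [Component("cta_button", {"label": "Sign up", "radius": 4})])],
)

# A targeted, component-level edit: only the button's corner radius changes.
edit_component(spec, "header", "cta_button", radius=12)
```

The point of the sketch is the editing granularity: because intent is stored as structured parameters rather than a prompt, an edit can address exactly one component without regenerating or re-describing the rest of the interface.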