AI Summary
Existing sketch-to-3D approaches neglect the temporal order of strokes, and thus fail to capture design intent and structural logic. This paper introduces the first sequence-aware VR sketch-to-3D generation framework, which explicitly models stroke timing to improve geometric fidelity and make interaction more natural. Methodologically: (1) we construct a large-scale, multi-category, temporally ordered VR sketch dataset via an automated pipeline that synthesizes time-stamped sketches; (2) we design a sequence-aware sketch encoder coupled with a diffusion-based generator for 3D voxel and mesh outputs. Our contributions are threefold: first, we pioneer temporal modeling in sketch-to-3D generation; second, our method generalizes robustly from synthetic to real-world sketches with minimal supervision; third, it significantly outperforms state-of-the-art methods in geometric accuracy, partial-sketch completion, and cross-domain generalization.
Abstract
VR sketching lets users explore and iterate on ideas directly in 3D, offering a faster and more intuitive alternative to conventional CAD tools. However, existing sketch-to-shape models ignore the temporal ordering of strokes, discarding crucial cues about structure and design intent. We introduce VRSketch2Shape, the first framework and multi-category dataset for generating 3D shapes from sequential VR sketches. Our contributions are threefold: (i) an automated pipeline that generates sequential VR sketches from arbitrary shapes; (ii) a dataset of over 20k synthetic and 900 hand-drawn sketch-shape pairs across four categories; and (iii) an order-aware sketch encoder coupled with a diffusion-based 3D generator. Our approach yields higher geometric fidelity than prior work, generalizes effectively from synthetic to real sketches with minimal supervision, and performs well even on partial sketches. All data and models will be released open-source at https://chenyizi086.github.io/VRSketch2Shape_website.