🤖 AI Summary
This paper addresses the slow inference and poor reconstruction quality of diffusion models for single-image 3D generation, particularly under few-step sampling (e.g., 4-8 steps). The authors propose an edge-consistency-guided score distillation framework with three components: (1) an edge-consistency regularization term that constrains score-function learning in noisy states to stabilize geometric structure; (2) adversarial data augmentation to strengthen fine-detail recovery; and (3) lightweight fine-tuning of a pre-trained diffusion model. At comparable generation quality, the approach achieves over 20× faster inference than state-of-the-art methods with 4-step sampling. It also sets new results on key metrics, including Chamfer Distance (CD), F-Score, and Chamfer-L1, demonstrating joint optimization of geometric fidelity and surface detail in few-step inference.
📝 Abstract
We present Acc3D to tackle the challenge of accelerating the diffusion process for generating 3D models from single images. To derive high-quality reconstructions through few-step inference, we emphasize the critical issue of regularizing the learning of the score function in states of random noise. To this end, we propose edge consistency, i.e., consistent predictions across the high signal-to-noise ratio region, to enhance a pre-trained diffusion model, enabling a distillation-based refinement of the endpoint score function. Building on the distilled diffusion model, we propose an adversarial augmentation strategy to further enrich generation detail and boost overall generation quality. The two modules complement and reinforce each other, jointly elevating generative performance. Extensive experiments demonstrate that our Acc3D not only achieves over a $20\times$ increase in computational efficiency but also yields notable quality improvements compared to the state of the art.
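The edge-consistency idea above, i.e., consistent predictions across the high signal-to-noise ratio region, can be pictured as a regularizer that penalizes disagreement between the denoiser's clean-sample predictions at two nearby low-noise timesteps. The following is a minimal numpy sketch under generic assumptions; the noise schedule, the `denoise` interface, and the squared-error form are illustrative placeholders, not the paper's exact formulation:

```python
import numpy as np

def edge_consistency_loss(denoise, x0, t_a, t_b, rng):
    """Illustrative edge-consistency regularizer (not Acc3D's exact loss).

    Noises a clean sample x0 to two nearby high-SNR (low-noise) timesteps
    with shared noise, then penalizes disagreement between the denoiser's
    predictions at those two states.
    """
    eps = rng.standard_normal(x0.shape)
    # Toy variance-preserving forward process; t in (0, 1), small t = high SNR.
    x_a = np.sqrt(1.0 - t_a) * x0 + np.sqrt(t_a) * eps
    x_b = np.sqrt(1.0 - t_b) * x0 + np.sqrt(t_b) * eps
    pred_a = denoise(x_a, t_a)
    pred_b = denoise(x_b, t_b)
    # Consistency penalty: predictions from nearby noise levels should agree.
    return float(np.mean((pred_a - pred_b) ** 2))
```

A denoiser that recovers the same clean sample at both noise levels incurs zero penalty, while one whose output drifts with the noise level is penalized, which is the stabilizing effect the abstract attributes to edge consistency.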