Single Image to High-Quality 3D Object via Latent Features

📅 2025-11-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of balancing speed, geometric detail, and reconstruction fidelity in single-image 3D reconstruction, this paper proposes LatentDreamer, a framework built on a pre-trained variational autoencoder (VAE) that maps 3D geometry into a compact latent space for efficient, high-fidelity 3D generation from a single image. Its key contributions are: (1) a latent feature representation that greatly reduces the difficulty of 3D generation; and (2) a sequential, coarse-to-fine generation pipeline that produces coarse geometry, then refined geometry, then textures, targeting both structural integrity and surface realism. With only a small amount of training, LatentDreamer generates a 3D model in approximately 70 seconds per image. On metrics including FID and Chamfer distance, it is reported to perform competitively with contemporary approaches.
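Since the summary cites Chamfer distance as an evaluation metric, here is a minimal sketch of how that metric is commonly computed between two point sets. It assumes the symmetric, mean-reduced, squared-distance variant; conventions differ between papers (squared vs. unsquared distances, sum vs. mean), and this page does not say which one LatentDreamer uses. The function names are illustrative, and the brute-force search is only suitable for small point sets.

```python
Point = tuple[float, float, float]


def _sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def _mean_nearest_sq_dist(src: list[Point], dst: list[Point]) -> float:
    """Mean, over points in src, of the squared distance to the nearest point in dst."""
    total = 0.0
    for p in src:
        total += min(_sq_dist(p, q) for q in dst)
    return total / len(src)


def chamfer_distance(a: list[Point], b: list[Point]) -> float:
    """Symmetric Chamfer distance (assumed variant: squared distances, mean-reduced)."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    # Sum of the two directed terms: a -> b and b -> a.
    return _mean_nearest_sq_dist(a, b) + _mean_nearest_sq_dist(b, a)
```

Lower values mean the two point sets lie closer together; identical sets give 0. In practice the points would be sampled from the surfaces of the generated and ground-truth meshes, and a spatial index or GPU implementation would replace the quadratic-time loop.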

📝 Abstract
3D assets are essential in the digital age. While automatic 3D generation, such as image-to-3D, has made significant strides in recent years, it often struggles to achieve fast, detailed, and high-fidelity generation simultaneously. In this work, we introduce LatentDreamer, a novel framework for generating 3D objects from single images. The key to our approach is a pre-trained variational autoencoder that maps 3D geometries to latent features, which greatly reduces the difficulty of 3D generation. Starting from latent features, the pipeline of LatentDreamer generates coarse geometries, refined geometries, and realistic textures sequentially. The 3D objects generated by LatentDreamer exhibit high fidelity to the input images, and the entire generation process can be completed within a short time (typically in 70 seconds). Extensive experiments show that with only a small amount of training, LatentDreamer demonstrates competitive performance compared to contemporary approaches.
Problem

Research questions and friction points this paper is trying to address.

Generating high-fidelity 3D objects from single images
Achieving fast and detailed 3D generation simultaneously
Reducing the difficulty of 3D generation using latent features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a pre-trained variational autoencoder to map 3D geometries to latent features
Sequentially generates coarse geometries, refined geometries, and textures
Achieves high-fidelity 3D generation, typically within 70 seconds
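The sequential structure listed above can be sketched as a chain of stages. This is a hypothetical illustration rather than the authors' implementation: the `Stages` container, the stage names, and the choice to pass the input image to the refinement and texturing stages are all assumptions, since this page only states that coarse geometry, refined geometry, and textures are generated in that order starting from latent features.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Stages:
    """Hypothetical stage callables; the real models are not described on this page."""
    image_to_latent: Callable[[Any], Any]   # assumed: predicts latent features from the image
    decode_coarse: Callable[[Any], Any]     # assumed: latent features -> coarse geometry
    refine: Callable[[Any, Any], Any]       # assumed: (coarse geometry, image) -> refined geometry
    texture: Callable[[Any, Any], Any]      # assumed: (refined geometry, image) -> textured object


def generate(image: Any, stages: Stages) -> Any:
    """Run the stages strictly in sequence: latent -> coarse -> refined -> textured."""
    latent = stages.image_to_latent(image)
    coarse = stages.decode_coarse(latent)
    refined = stages.refine(coarse, image)
    return stages.texture(refined, image)
```

Keeping each stage behind a plain callable makes the ordering explicit and lets any single stage be swapped or timed in isolation.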
Huanning Dong
University of Electronic Science and Technology of China, Chengdu, China
Yinuo Huang
Ph.D. Candidate, University of Electronic Science and Technology of China
Wireless communications, machine learning
Fan Li
University of Electronic Science and Technology of China, Chengdu, China
Ping Kuang
University of Electronic Science and Technology of China, Chengdu, China