🤖 AI Summary
Existing high-fidelity 3D generation methods struggle to model arbitrary topologies (e.g., open surfaces, non-manifold structures): approaches based on signed distance fields (SDFs) require costly watertight preprocessing, and point-cloud representations suffer from sampling artifacts. To address these issues, this paper proposes a local-to-global (LoG) generative architecture based on unsigned distance fields (UDFs). The key contributions are: (1) a UBlock tiling mechanism with a Pad-Average strategy that enables stable modeling at ultra-high resolution (up to 2048³); (2) hybrid geometric modeling that combines 3D convolutions for capturing local geometric detail with sparse Transformers for global structural coherence; and (3) an end-to-end variational autoencoder (VAE) training framework. Experiments demonstrate state-of-the-art performance in reconstruction accuracy, surface smoothness, and topological flexibility, achieving, for the first time, high-resolution, high-quality unified modeling of complex non-manifold and open-surface geometries.
📝 Abstract
Generating high-fidelity 3D content remains a fundamental challenge due to the complexity of representing arbitrary topologies, such as open surfaces and intricate internal structures, while preserving geometric details. Prevailing methods based on signed distance fields (SDFs) are hampered by costly watertight preprocessing and struggle with non-manifold geometries, while point-cloud representations often suffer from sampling artifacts and surface discontinuities. To overcome these limitations, we propose a novel 3D variational autoencoder (VAE) framework built upon unsigned distance fields (UDFs), a more robust and computationally efficient representation that naturally handles complex and incomplete shapes. Our core innovation is a local-to-global (LoG) architecture that processes the UDF by partitioning it into uniform subvolumes, termed UBlocks. This architecture couples 3D convolutions for capturing local detail with sparse transformers for enforcing global coherence. A Pad-Average strategy further ensures smooth transitions at subvolume boundaries during reconstruction. This modular design enables seamless scaling to ultra-high resolutions of up to 2048³, a regime previously unattainable for 3D VAEs. Experiments demonstrate state-of-the-art performance in both reconstruction accuracy and generative quality, yielding superior surface smoothness and geometric flexibility.
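The abstract does not spell out how UBlocks are extracted or how Pad-Average blends them, so the following is only a minimal NumPy sketch of one plausible reading: each subvolume is cut with a margin of `pad` voxels, and overlapping voxels are averaged when the blocks are merged back. The function names `split_into_blocks` and `merge_blocks`, the edge-replicating padding, and the uniform averaging weights are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def split_into_blocks(volume, block, pad):
    """Cut a cubic volume into overlapping blocks of side block + 2 * pad.

    Hypothetical helper: the border is padded by replicating edge values
    (an assumption), so every block has the same shape.
    """
    n = volume.shape[0]
    assert volume.shape == (n, n, n) and n % block == 0
    padded = np.pad(volume, pad, mode="edge")
    side = block + 2 * pad
    blocks = {}
    for i in range(0, n, block):
        for j in range(0, n, block):
            for k in range(0, n, block):
                blocks[(i, j, k)] = padded[i:i + side, j:j + side, k:k + side].copy()
    return blocks


def merge_blocks(blocks, n, block, pad):
    """Average overlapping padded blocks back into an n x n x n volume."""
    side = block + 2 * pad
    acc = np.zeros((n + 2 * pad,) * 3)   # running sum of block values
    cnt = np.zeros_like(acc)             # number of blocks covering each voxel
    for (i, j, k), b in blocks.items():
        acc[i:i + side, j:j + side, k:k + side] += b
        cnt[i:i + side, j:j + side, k:k + side] += 1
    merged = acc / cnt                   # every voxel is covered at least once
    return merged[pad:pad + n, pad:pad + n, pad:pad + n]


# Usage: a random stand-in for a UDF grid, split and merged back.
rng = np.random.default_rng(0)
udf = rng.random((8, 8, 8))
blocks = split_into_blocks(udf, block=4, pad=1)
restored = merge_blocks(blocks, n=8, block=4, pad=1)
```

When the blocks are untouched, averaging identical overlapping copies returns the original grid. If the blocks are decoded independently and disagree near a seam, the overlap region receives the mean of the competing predictions, which is the smoothing effect the abstract attributes to Pad-Average.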