🤖 AI Summary
Existing 3D building reconstruction methods struggle to simultaneously preserve surface detail fidelity and ensure physical consistency for LiDAR point clouds corrupted by strong noise and exhibiting non-uniform point density. To address this, we propose OCCDiff, the first framework to introduce latent diffusion models into occupancy function space modeling. It employs a functional autoencoder to yield continuous, differentiable implicit field representations. Furthermore, we design a conditional, multi-modal feature injection mechanism that integrates a point cloud encoder with multi-task training, significantly enhancing robustness against noise and sparsity. OCCDiff enables high-fidelity reconstruction at arbitrary resolutions and consistently outperforms state-of-the-art methods across diverse noise levels and point densities. Quantitative and qualitative evaluations demonstrate its superior geometric accuracy, physical plausibility, and generalization capability.
📝 Abstract
A major challenge in reconstructing buildings from LiDAR point clouds lies in accurately capturing building surfaces under varying point densities and noise interference. To flexibly recover high-quality 3D building profiles at diverse resolutions, we propose OCCDiff, which applies latent diffusion in the occupancy function space. OCCDiff combines a latent diffusion process with a function autoencoder architecture to generate continuous occupancy functions that can be evaluated at arbitrary locations. Moreover, a point encoder is proposed that serves three roles: it provides condition features for diffusion learning, constrains the final occupancy prediction of the occupancy decoder, and injects multi-modal features into the latent encoder for latent generation. To further enhance model performance, a multi-task training strategy is employed, ensuring that the point encoder learns diverse and robust feature representations. Empirical results show that our method generates physically consistent samples with high fidelity to the target distribution and exhibits robustness to noisy data.
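To illustrate what "continuous occupancy functions evaluable at arbitrary locations" means in practice, here is a minimal sketch in Python with NumPy. It is not the OCCDiff implementation: `toy_occupancy` is a hypothetical analytic stand-in for the learned occupancy decoder, its `latent` argument is simply a box's half-extents rather than a code sampled by a diffusion model, and `query_grid` is an illustrative helper. The point is only that the same function can be sampled on grids of any resolution without retraining or resampling the representation.

```python
import numpy as np


def toy_occupancy(points, latent):
    """Hypothetical stand-in for a learned occupancy decoder.

    Assumption: `latent` holds the half-extents of an axis-aligned box
    centred at the origin. A real decoder would be a neural network
    conditioned on a latent code produced by the diffusion model.
    Returns 1.0 for points inside the box and 0.0 otherwise.
    """
    inside = np.all(np.abs(points) <= latent, axis=-1)
    return inside.astype(np.float32)


def query_grid(occupancy_fn, latent, resolution, bound=1.0):
    """Evaluate an occupancy function at the cell centres of a regular
    grid covering the cube [-bound, bound]^3."""
    step = 2.0 * bound / resolution
    axis = -bound + step * (np.arange(resolution) + 0.5)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    occ = occupancy_fn(points, latent)
    return occ.reshape(resolution, resolution, resolution)


# The same function (same "latent") queried at two different resolutions.
latent = np.array([0.5, 0.25, 0.75])
coarse = query_grid(toy_occupancy, latent, resolution=16)
fine = query_grid(toy_occupancy, latent, resolution=64)

# Fraction of the cube that is occupied, estimated from each grid.
print(coarse.shape, float(coarse.mean()))
print(fine.shape, float(fine.mean()))
```

Because the representation is a function rather than a fixed voxel grid, the output resolution is chosen at query time. In a full pipeline, a surface would typically be extracted from such a grid with an isosurface method; that step is omitted here.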