Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion

📅 2024-12-18
🏛️ arXiv.org
📈 Citations: 10
Influential: 0
🤖 AI Summary
Depth completion aims to reconstruct dense depth maps from sparse, irregularly distributed depth observations. Existing methods generalize poorly, especially across domains and with extremely low-density inputs (<1%). This paper proposes the first zero-shot depth completion framework: it reformulates the task as an image-conditioned generative problem and builds on a pretrained latent diffusion model for monocular depth estimation, the first use of such a model in this domain. Sparse depth enters through a test-time guided optimization that combines RGB priors with sparse depth constraints without fine-tuning the model. Gradients from the depth observations are back-propagated during iterative denoising, which gives the method strong generalization and robustness. Experiments show significant improvements over both supervised and self-supervised state-of-the-art methods across diverse real-world scenarios, and accuracy holds up particularly well under ultra-sparse conditions.
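
To make the ultra-sparse regime concrete, the sketch below keeps fewer than 1% of the pixels of a synthetic depth map and evaluates an error only at those pixels, together with its gradient with respect to a dense prediction. The function names `sample_sparse_depth` and `masked_l2_and_grad`, the synthetic ramp scene, and the mean squared loss are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def sample_sparse_depth(dense, density, rng):
    """Keep a random subset of pixels; return (sparse values, boolean mask)."""
    n_keep = max(1, int(round(density * dense.size)))
    idx = rng.choice(dense.size, size=n_keep, replace=False)
    mask = np.zeros(dense.size, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(dense.shape)
    # Unobserved pixels are set to 0 and must always be read through the mask.
    sparse = np.where(mask, dense, 0.0)
    return sparse, mask

def masked_l2_and_grad(pred, sparse, mask):
    """Mean squared error over observed pixels and its gradient w.r.t. pred."""
    n = mask.sum()
    diff = np.where(mask, pred - sparse, 0.0)
    loss = float((diff ** 2).sum() / n)
    grad = 2.0 * diff / n  # zero wherever there is no measurement
    return loss, grad

rng = np.random.default_rng(0)
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
dense = 1.0 + 0.05 * xx + 0.02 * yy  # synthetic tilted plane (assumption)

sparse, mask = sample_sparse_depth(dense, density=0.005, rng=rng)
pred = np.full_like(dense, dense.mean())  # naive constant prediction
loss, grad = masked_l2_and_grad(pred, sparse, mask)

print(f"observed pixels: {mask.sum()} of {mask.size} ({100 * mask.mean():.2f}%)")
print(f"masked loss of the constant prediction: {loss:.4f}")
```

Because the gradient is zero everywhere except at the measured pixels, the measurements alone cannot determine the rest of the map; in the approach summarized above, that role falls to the image-conditioned prior.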

📝 Abstract
Depth completion upgrades sparse depth measurements into dense depth maps guided by a conventional image. Existing methods for this highly ill-posed task operate in tightly constrained settings and tend to struggle when applied to images outside the training domain or when the available depth measurements are sparse, irregularly distributed, or of varying density. Inspired by recent advances in monocular depth estimation, we reframe depth completion as an image-conditional depth map generation guided by sparse measurements. Our method, Marigold-DC, builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance via an optimization scheme that runs in tandem with the iterative inference of denoising diffusion. The method exhibits excellent zero-shot generalization across a diverse range of environments and handles even extremely sparse guidance effectively. Our results suggest that contemporary monocular depth priors greatly robustify depth completion: it may be better to view the task as recovering dense depth from (dense) image pixels, guided by sparse depth; rather than as inpainting (sparse) depth, guided by an image. Project website: https://MarigoldDepthCompletion.github.io/
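
The idea of running an optimization in tandem with iterative inference can be illustrated with a deliberately simplified toy. In the sketch below, a neighbour-averaging step stands in for one denoising step of the pretrained model, and a gradient step on a masked squared error stands in for the guidance. Everything here is an assumption made for illustration: there is no neural network, no latent space, and no image conditioning, and the names `toy_prior_step`, `guidance_step`, and `guided_completion` are hypothetical rather than part of Marigold-DC.

```python
import numpy as np

def toy_prior_step(x, strength=0.5):
    """Hypothetical stand-in for one denoising step: blend with a 4-neighbour average."""
    padded = np.pad(x, 1, mode="edge")
    neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                  + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    return (1.0 - strength) * x + strength * neighbours

def guidance_step(x, sparse, mask, step_size):
    """Gradient step on 0.5 * sum of squared residuals at the measured pixels."""
    residual = np.where(mask, x - sparse, 0.0)
    return x - step_size * residual

def guided_completion(sparse, mask, num_steps=300, step_size=0.9):
    """Alternate the toy prior step with the measurement guidance step."""
    x = np.full(sparse.shape, sparse[mask].mean())
    for _ in range(num_steps):
        x = toy_prior_step(x)                          # prior: keep the map plausible
        x = guidance_step(x, sparse, mask, step_size)  # data: pull toward measurements
    return x

rng = np.random.default_rng(0)
h, w = 48, 48
yy, xx = np.mgrid[0:h, 0:w]
truth = 1.0 + 0.05 * xx + 0.02 * yy  # synthetic ground truth (assumption)
mask = rng.random((h, w)) < 0.02     # roughly 2% of pixels are measured
mask[0, 0] = True                    # guarantee at least one measurement
sparse = np.where(mask, truth, 0.0)

step_size = 0.9
completed = guided_completion(sparse, mask, num_steps=300, step_size=step_size)
residual = np.abs(completed - sparse)[mask].max()
rmse = np.sqrt(np.mean((completed - truth) ** 2))
print(f"max residual at measured pixels: {residual:.4f}")
print(f"dense RMSE against synthetic ground truth: {rmse:.4f}")
```

The design point the toy preserves is that the prior is never retrained: the measurements only steer the iterates at inference time. In the method described in the abstract, the prior is a pretrained latent diffusion model conditioned on the image, which this toy does not attempt to reproduce.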

Problem

Research questions and friction points this paper is trying to address.

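Existing depth completion methods operate in tightly constrained settings and struggle on images outside the training domain
Performance degrades when depth measurements are sparse, irregularly distributed, or of varying density
The task is usually treated as inpainting sparse depth guided by an image, rather than as image-conditional depth generation guided by sparse depth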

Innovation

Methods, ideas, or system contributions that make the work stand out.

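Zero-shot depth completion built on a pretrained latent diffusion model for monocular depth estimation
Sparse depth measurements injected as test-time guidance, without fine-tuning the model
Optimization scheme that runs in tandem with the iterative inference of denoising diffusion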