Depth Estimation Based on 3D Gaussian Splatting Siamese Defocus

📅 2024-09-18
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address monocular depth estimation without all-in-focus (AIF) image guidance, this paper proposes a self-supervised framework that takes a single defocused image as input and jointly predicts a defocus map and the Circle of Confusion (CoC), from which scene depth is regressed. The key innovation is the integration of 3D Gaussian Splatting into defocused-image rendering, which provides a differentiable self-supervision signal and removes the reliance on AIF reference images or ground-truth depth labels. A Siamese network architecture enables end-to-end joint optimization of defocus and depth estimation. Experiments on both synthetic and real-world defocused datasets indicate that the method significantly outperforms conventional depth-from-defocus (DFD) approaches, achieving state-of-the-art quantitative accuracy and visual quality in depth prediction.

📝 Abstract
Depth estimation is a fundamental task in 3D geometry. While stereo depth estimation can be achieved through triangulation, monocular methods are less straightforward because they require the integration of global and local information. The Depth from Defocus (DFD) method uses camera lens models and parameters to recover depth information from blurred images and has been shown to perform well. However, these methods rely on All-In-Focus (AIF) images for depth estimation, which are nearly impossible to obtain in real-world applications. To address this issue, we propose a self-supervised framework based on 3D Gaussian splatting and Siamese networks. By learning the blur levels at different focal distances of the same scene in the focal stack, the framework predicts the defocus map and Circle of Confusion (CoC) from a single defocused image, and uses the defocus map as input to DepthNet for monocular depth estimation. The 3D Gaussian splatting model renders defocused images using the predicted CoC, and the differences between these renderings and the real defocused images provide additional supervision signals for the Siamese Defocus self-supervised network. The framework has been validated on both synthetic and real defocused datasets. Quantitative and visualization experiments demonstrate that the proposed framework is highly effective as a DFD method.
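The abstract's link between blur and depth can be illustrated with the standard thin-lens model, in which the CoC diameter grows as a point moves away from the focus plane. The sketch below is not the paper's implementation; this page does not give the authors' exact formulation. The function names, parameter names, and the `behind_focus` flag are hypothetical, chosen only for illustration.

```python
def coc_diameter(depth_m, focus_m, focal_len_m, f_number):
    """Thin-lens circle-of-confusion diameter on the sensor, in metres.

    depth_m: distance from the lens to the scene point
    focus_m: distance from the lens to the in-focus plane
    """
    aperture = focal_len_m / f_number  # aperture diameter
    scale = aperture * focal_len_m / (focus_m - focal_len_m)
    return scale * abs(depth_m - focus_m) / depth_m


def depth_from_coc(coc_m, focus_m, focal_len_m, f_number, behind_focus):
    """Invert the thin-lens CoC formula to recover depth.

    The CoC alone is ambiguous: a point in front of the focus plane and a
    point behind it can produce the same blur, so the caller must say which
    side the point lies on (assumption made for this sketch).
    """
    aperture = focal_len_m / f_number
    scale = aperture * focal_len_m / (focus_m - focal_len_m)
    ratio = coc_m / scale
    if behind_focus:
        if ratio >= 1.0:
            raise ValueError("CoC too large for a point behind the focus plane")
        return focus_m / (1.0 - ratio)
    return focus_m / (1.0 + ratio)


if __name__ == "__main__":
    # 50 mm lens at f/2 focused at 2 m; scene point at 4 m.
    c = coc_diameter(4.0, 2.0, 0.05, 2.0)
    print(f"CoC diameter: {c * 1e3:.4f} mm")
    print(f"Recovered depth: {depth_from_coc(c, 2.0, 0.05, 2.0, True):.3f} m")
```

The front/behind ambiguity shown here is one reason a single blur measurement does not determine depth on its own. The abstract describes learning blur levels across the different focal distances of a focal stack.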

Problem

Research questions and friction points this paper is trying to address.

Monocular depth estimation without All-In-Focus images
Self-supervised defocus map prediction from blurred images
3D Gaussian splatting for depth-aware blur rendering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised 3D Gaussian splatting for depth estimation
Siamese network predicts defocus map and CoC
Monocular depth estimation from single defocused image

Jinchang Zhang
University of Georgia
Ningning Xu
University of Georgia
Hao Zhang
University of Massachusetts Amherst
Guoyu Lu
SUNY Binghamton
Robotics, Computer Vision, Machine Learning