Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the “flying points” artifact in depth estimation: spurious 3D points that appear in the empty space between foreground and background surfaces near object boundaries. The authors trace the artifact to the standard choice of assigning each pixel a single depth hypothesis. To resolve it, they introduce Mixture Density Aggregation (MDA), a mixture-density representation that predicts multiple depth hypotheses and their associated probabilities for each pixel, then selects the decoded depth from one of these hypotheses rather than placing it between them. The same framework handles object boundaries, multi-layered transparent objects, and unbounded sky regions. Across different backbones, MDA substantially improves boundary reconstruction and largely removes flying-point artifacts, even under severe input blur, while adding negligible runtime overhead.
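The decoding step described above can be illustrated with a minimal sketch. The summary states only that the decoded depth is selected from one of the hypotheses; the argmax-over-probabilities rule, the function names `decode_depth` and `expected_depth`, and the array layout below are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def decode_depth(weights, means):
    """Return, per pixel, the depth of the most probable hypothesis.

    weights: (H, W, K) hypothesis probabilities, summing to 1 over K.
    means:   (H, W, K) depth value of each hypothesis.
    Assumption: selection is a plain argmax over the probabilities.
    """
    best = np.argmax(weights, axis=-1)  # (H, W) index of the top hypothesis
    return np.take_along_axis(means, best[..., None], axis=-1)[..., 0]

def expected_depth(weights, means):
    """Probability-weighted average, shown only for contrast."""
    return np.sum(weights * means, axis=-1)

# One hypothetical boundary pixel: foreground at 1.0 m (p = 0.6),
# background at 5.0 m (p = 0.4).
weights = np.array([[[0.6, 0.4]]])
means = np.array([[[1.0, 5.0]]])

selected = decode_depth(weights, means)    # lies on the foreground surface
averaged = expected_depth(weights, means)  # lies between the two surfaces
```

In this toy case the selected depth is 1.0 m, which sits on a real surface, whereas the weighted average is 2.6 m, a point in the empty space between foreground and background.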

📝 Abstract

Despite advances in depth estimation, flying points remain a persistent failure mode: near object boundaries, depth estimators often predict spurious 3D points in the empty space between foreground and background surfaces. We trace this artifact to a standard modeling choice: assigning each pixel a single depth hypothesis. At boundaries, a pixel can straddle a foreground and a background surface, so its true depth is ambiguous between the two. A model that predicts a single depth cannot keep both possibilities, so training instead pulls the prediction toward an intermediate depth that lies on neither surface. We address this with MDA, a mixture-density representation that lets the model predict multiple depth hypotheses and their associated probabilities for each pixel. Near boundaries, different hypotheses can align with different surfaces, and the decoded depth is selected from one of these hypotheses rather than placed in the empty space between them. Across different backbones, MDA substantially improves boundary reconstruction and largely removes flying-point artifacts even under severe input blur, while adding negligible runtime overhead. The same mixture-density framework naturally extends to transparent objects, where it predicts multiple depth layers at transparent pixels, and to sky regions, where a dedicated component separates the unbounded sky from finite-depth regions, producing flying-point-free skylines. Project Page: https://biansy000.github.io/mda-site/.
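The claim that single-depth training pulls the prediction toward an intermediate depth can be checked with a small numerical sketch. It assumes a squared-error objective and a hypothetical boundary pixel whose supervision is 1.0 m in 60% of samples and 5.0 m in the rest; neither the loss nor these numbers come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ambiguous supervision at one boundary pixel:
# foreground (1.0 m) with probability 0.6, background (5.0 m) otherwise.
targets = np.where(rng.random(2000) < 0.6, 1.0, 5.0)

# Evaluate the mean squared error of every candidate single-depth prediction.
candidates = np.linspace(0.0, 6.0, 601)  # 0.01 m grid
mse = ((candidates[:, None] - targets[None, :]) ** 2).mean(axis=1)

# The best single prediction is the grid point closest to the sample mean,
# which lies on neither surface.
best_single = candidates[np.argmin(mse)]
```

Under these assumptions the minimizer lands near the sample mean of the targets, between the two surfaces, which is the flying-point behavior the abstract describes. A representation that keeps both depths as separate hypotheses avoids committing to that midpoint.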

Problem

Research questions and friction points this paper is trying to address.

depth estimation

flying points

depth ambiguity

object boundaries

mixture-density representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

mixture-density representation

depth ambiguity

flying-point-free depth estimation