🤖 AI Summary
Depth completion aims to reconstruct dense, accurate depth maps from sparse depth measurements and a corresponding RGB image. Existing single-scale propagation methods are computationally inefficient and have limited capacity for modeling scene context. To address these limitations, we propose LP-Net, a framework that applies Laplacian pyramid decomposition to progressive depth prediction: it first predicts a low-resolution depth map that captures global scene structure, then upsamples it and restores high-frequency details level by level. We design a Multi-path Feature Pyramid module to strengthen multi-scale contextual awareness, and a Selective Depth Filtering module that dynamically learns to apply smoothing and sharpening filters, reducing noise while preserving fine detail. LP-Net achieves state-of-the-art performance on KITTI, NYUv2, and TOFDC with better computational efficiency, and at the time of submission ranked 1st among peer-reviewed methods on the official KITTI leaderboard.
📝 Abstract
Depth completion aims to reconstruct a dense depth map from sparse depth measurements, guided by a corresponding color image. Most existing approaches rely on single-scale propagation strategies that iteratively refine an initial coarse depth estimate through pixel-level message passing. Despite their strong results, these techniques are often limited by computational inefficiency and a weak grasp of scene context. To address these challenges, we introduce LP-Net, a framework that implements a multi-scale, progressive prediction paradigm based on Laplacian Pyramid decomposition. Unlike propagation-based approaches, LP-Net starts with a coarse, low-resolution depth prediction that captures the global scene context, then refines it through successive upsampling and the restoration of high-frequency details at progressively finer scales. We develop two novel modules to support this strategy: 1) the Multi-path Feature Pyramid module, which splits feature maps into separate pathways and applies multi-scale transformations to aggregate comprehensive spatial information, and 2) the Selective Depth Filtering module, which dynamically learns to apply both smoothness and sharpness filters to reduce noise while accentuating fine details. With these components, LP-Net achieves state-of-the-art (SOTA) performance on outdoor and indoor benchmarks, including KITTI, NYUv2, and TOFDC, and also demonstrates superior computational efficiency. At the time of submission, LP-Net ranks 1st among all peer-reviewed methods on the official KITTI leaderboard.
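The coarse-to-fine idea behind a Laplacian pyramid can be illustrated with a minimal NumPy sketch. This is not LP-Net's implementation: in LP-Net the coarse map and the high-frequency residuals are predicted by the network, whereas here they are computed from a known dense map purely to show the decomposition and its inverse. The helper names (`downsample`, `upsample`, `build_laplacian_pyramid`, `reconstruct`) are hypothetical, and 2x2 average pooling with nearest-neighbour upsampling is an assumed stand-in for the Gaussian filtering of a classical Laplacian pyramid.

```python
import numpy as np


def downsample(x):
    """Halve the resolution by averaging each 2x2 block (assumed filter)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double the resolution by nearest-neighbour repetition (assumed filter)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)


def build_laplacian_pyramid(depth, levels):
    """Split a dense map into a coarse base plus per-level residuals.

    Height and width must be divisible by 2**levels.
    Returns the coarsest map and the residuals ordered fine-to-coarse.
    """
    residuals = []
    current = depth
    for _ in range(levels):
        coarse = downsample(current)
        # High-frequency detail lost when moving to the coarser level.
        residuals.append(current - upsample(coarse))
        current = coarse
    return current, residuals


def reconstruct(coarse, residuals):
    """Progressively upsample the coarse map and add back the details."""
    current = coarse
    for residual in reversed(residuals):  # coarsest residual first
        current = upsample(current) + residual
    return current


# Synthetic dense depth map in metres (illustrative values only).
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 80.0, size=(32, 64))

coarse, residuals = build_laplacian_pyramid(depth, levels=3)
restored = reconstruct(coarse, residuals)
```

The coarse map carries the global layout at one eighth of the original resolution, and each residual adds back only the detail missing at its scale. A progressive predictor in this style would estimate these pieces from image and sparse-depth features instead of computing them from a ground-truth map.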