Monocular Depth Guided Occlusion-Aware Disparity Refinement via Semi-supervised Learning in Laparoscopic Images

📅 2025-05-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address disparity estimation in stereo laparoscopic images, where occlusions are severe and annotated data are scarce, this paper proposes the Depth Guided Occlusion-Aware Disparity Refinement Network (DGORNet). DGORNet refines disparity maps using monocular depth information, which is unaffected by occlusion, to suppress occlusion-induced artifacts; adds a Position Embedding (PE) module that supplies explicit spatial context for more accurate localization; and introduces an Optical Flow Difference Loss (OFDLoss) that exploits temporal continuity across unlabeled video frames for semi-supervised training. On the SCARED dataset, DGORNet outperforms state-of-the-art methods in end-point error (EPE) and root-mean-square error (RMSE), particularly in occluded and textureless regions. Ablation studies confirm that the Position Embedding and OFDLoss each contribute to spatial and temporal consistency in dynamic surgical scenes.
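For readers unfamiliar with the two reported metrics, the following is a minimal NumPy sketch of how EPE and RMSE are commonly computed for disparity maps. The function name `disparity_errors` and the convention that a ground-truth value of 0 marks a pixel without annotation are assumptions made for illustration; the paper's own evaluation protocol is not described on this page.

```python
import numpy as np

def disparity_errors(pred, gt, valid=None):
    """Return (EPE, RMSE) over the valid pixels of two disparity maps.

    Assumption: when no mask is given, pixels with gt <= 0 or non-finite gt
    are treated as unannotated and excluded.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if valid is None:
        valid = np.isfinite(gt) & (gt > 0)
    diff = pred[valid] - gt[valid]
    epe = np.mean(np.abs(diff))          # mean absolute disparity error
    rmse = np.sqrt(np.mean(diff ** 2))   # penalizes large errors more strongly
    return float(epe), float(rmse)

# Toy 2x2 example; the bottom-left pixel has no ground truth.
gt = np.array([[10.0, 12.0], [0.0, 8.0]])
pred = np.array([[11.0, 12.0], [5.0, 5.0]])
epe, rmse = disparity_errors(pred, gt)
print(epe, rmse)
```

Because RMSE squares each error before averaging, a method that fails badly in a few occluded pixels tends to be penalized more in RMSE than in EPE, which is one reason the two are often reported together.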

📝 Abstract
Occlusion and the scarcity of labeled surgical data are significant challenges in disparity estimation for stereo laparoscopic images. To address these issues, this study proposes a Depth Guided Occlusion-Aware Disparity Refinement Network (DGORNet), which refines disparity maps by leveraging monocular depth information unaffected by occlusion. A Position Embedding (PE) module is introduced to provide explicit spatial context, enhancing the network's ability to localize and refine features. Furthermore, we introduce an Optical Flow Difference Loss (OFDLoss) for unlabeled data, leveraging temporal continuity across video frames to improve robustness in dynamic surgical scenes. Experiments on the SCARED dataset demonstrate that DGORNet outperforms state-of-the-art methods in terms of End-Point Error (EPE) and Root Mean Squared Error (RMSE), particularly in occluded and textureless regions. Ablation studies confirm the contributions of the Position Embedding and Optical Flow Difference Loss, highlighting their roles in improving spatial and temporal consistency. These results underscore DGORNet's effectiveness in enhancing disparity estimation for laparoscopic surgery, offering a practical way to cope with occlusion and limited labeled data.
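The abstract does not give the formulation of OFDLoss, so the sketch below is only a generic stand-in for the idea of using temporal continuity on unlabeled frames. It assumes a precomputed optical flow field and penalizes the change in predicted disparity along each flow vector. The function name `temporal_disparity_penalty`, the nearest-neighbour lookup, and the assumption that disparity stays roughly constant along the motion are all illustrative choices, not the authors' method.

```python
import numpy as np

def temporal_disparity_penalty(disp_t, disp_t1, flow):
    """Mean |disp_t1(x + flow(x)) - disp_t(x)| over pixels that stay in frame.

    disp_t, disp_t1: (H, W) predicted disparity maps for consecutive frames.
    flow: (H, W, 2) optical flow from frame t to t+1, stored as (dx, dy) in pixels.
    """
    h, w = disp_t.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Nearest-neighbour position of each pixel in frame t+1.
    xt = np.rint(xs + flow[..., 0]).astype(int)
    yt = np.rint(ys + flow[..., 1]).astype(int)
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    if not inside.any():
        return 0.0
    warped = disp_t1[yt[inside], xt[inside]]
    return float(np.mean(np.abs(warped - disp_t[inside])))

# Toy scene: everything shifts one pixel to the right between frames.
disp_t = np.arange(12, dtype=np.float64).reshape(3, 4)
flow = np.zeros((3, 4, 2))
flow[..., 0] = 1.0
disp_t1 = np.roll(disp_t, 1, axis=1)
penalty = temporal_disparity_penalty(disp_t, disp_t1, flow)
print(penalty)
```

No ground-truth disparity appears anywhere in this penalty, which is what makes a term of this kind usable on unlabeled video. A training version would need a differentiable warp (for example bilinear sampling) in place of the rounded lookup used here.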
Problem

Research questions and friction points this paper is trying to address.

Refining disparity maps in laparoscopic images using monocular depth
Addressing occlusion and limited labeled data in surgical disparity estimation
Enhancing spatial and temporal consistency in dynamic surgical scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monocular depth guides occlusion-aware disparity refinement
Position Embedding module enhances spatial context
Optical Flow Difference Loss improves temporal consistency
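The design of the Position Embedding module is not specified on this page. One simple way to give a convolutional network explicit spatial context is to append normalized coordinate maps to a feature tensor, sketched below with NumPy. The function name `add_coordinate_channels`, the (C, H, W) layout, and the [-1, 1] normalization are assumptions for illustration and may differ from what DGORNet actually does.

```python
import numpy as np

def add_coordinate_channels(features):
    """Append normalized x and y coordinate maps to a (C, H, W) feature array."""
    c, h, w = features.shape
    # Each map spans [-1, 1] along its own axis and is constant along the other.
    ys = np.linspace(-1.0, 1.0, h).reshape(h, 1).repeat(w, axis=1)
    xs = np.linspace(-1.0, 1.0, w).reshape(1, w).repeat(h, axis=0)
    return np.concatenate([features, xs[None], ys[None]], axis=0)

feats = np.zeros((8, 4, 6))
out = add_coordinate_channels(feats)
print(out.shape)  # (10, 4, 6)
```

With the two extra channels, later layers can condition on where a feature sits in the image, which plain convolutions cannot do because they are translation-equivariant.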
Ziteng Liu
Vanderbilt University
Dongdong He
The Chinese University of Hong Kong, Shenzhen
Modeling and Scientific Computing, Fluid Mechanics, Numerical Methods for PDEs
Chenghong Zhang
School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
Wenpeng Gao
School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
Yili Fu
State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, China