Endo-SemiS: Towards Robust Semi-Supervised Image Segmentation for Endoscopic Video

📅 2025-12-18
🤖 AI Summary
To address scarce annotations and poor cross-scenario generalization in endoscopic video segmentation, this paper proposes a robust semi-supervised framework. It combines a dual-network cross-supervision architecture with uncertainty-guided dynamic pseudo-label selection and mutual learning at both the feature and image levels. It also adds a 3D CNN-based spatiotemporal calibration module that explicitly models inter-frame continuity. The core contributions are: (i) a four-strategy collaborative semi-supervised mechanism tailored to endoscopic video; and (ii) effective generalization across clinical scenarios, demonstrated on ureteroscopy lithotripsy and colonoscopy polyp datasets using only 10% labeled data. On two real-world endoscopic benchmarks, the method achieves Dice scores of 92.3% (kidney stones) and 86.7% (polyps), surpassing state-of-the-art methods. The code is publicly available.
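For readers unfamiliar with the metric, the Dice score quoted in the summary is the standard overlap measure between a predicted mask and a reference mask: twice the number of shared foreground pixels divided by the total foreground pixels in both masks. The sketch below is illustrative only; the function name `dice_score`, the flat-list mask representation, and the empty-mask convention are assumptions, not taken from the paper or its repository.

```python
def dice_score(pred, target):
    """Dice = 2 * overlap / (size of pred + size of target) for flat 0/1 masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Toy 6-pixel example: 2 shared foreground pixels, 3 foreground pixels in each mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels.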

📝 Abstract
In this paper, we present Endo-SemiS, a semi-supervised segmentation framework for providing reliable segmentation of endoscopic video frames with limited annotation. Endo-SemiS uses four strategies to improve performance by effectively utilizing all available data, particularly unlabeled data: (1) Cross-supervision, in which two individual networks supervise each other; (2) Uncertainty-guided pseudo-labels from unlabeled data, which are generated by selecting high-confidence regions to improve their quality; (3) Joint pseudo-label supervision, which aggregates reliable pixels from the pseudo-labels of both networks to provide accurate supervision for unlabeled data; and (4) Mutual learning, where both networks learn from each other at the feature and image levels, reducing variance and guiding them toward a consistent solution. Additionally, a separate corrective network utilizes spatiotemporal information from endoscopy video to improve segmentation performance. Endo-SemiS is evaluated on two clinical applications: kidney stone laser lithotomy from ureteroscopy and polyp screening from colonoscopy. Compared to state-of-the-art segmentation methods, Endo-SemiS achieves substantially superior results on both datasets with limited labeled data. The code is publicly available at https://github.com/MedICL-VU/Endo-SemiS
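Strategies (2) and (3) can be illustrated with a small sketch. The abstract does not say how uncertainty is measured or how the two pseudo-labels are merged, so everything below is an assumption for illustration: binary entropy as the uncertainty measure, the `max_entropy` threshold, the `IGNORE` marker, and the rule of taking each pixel from the less uncertain network. None of the names come from the Endo-SemiS code.

```python
import math

IGNORE = -1  # hypothetical marker for pixels excluded from the unlabeled loss


def binary_entropy(p):
    """Entropy in bits of a foreground probability: 0 = certain, 1 = maximally uncertain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_pseudo_label(probs, max_entropy=0.5):
    """Keep a hard label only where one network's prediction is confident enough."""
    labels = []
    for p in probs:
        if binary_entropy(p) <= max_entropy:
            labels.append(1 if p >= 0.5 else 0)
        else:
            labels.append(IGNORE)
    return labels


def joint_pseudo_label(probs_a, probs_b, max_entropy=0.5):
    """Per pixel, take the label from whichever network is less uncertain (assumed rule)."""
    joint = []
    for pa, pb in zip(probs_a, probs_b):
        ha, hb = binary_entropy(pa), binary_entropy(pb)
        best_p, best_h = (pa, ha) if ha <= hb else (pb, hb)
        if best_h <= max_entropy:
            joint.append(1 if best_p >= 0.5 else 0)
        else:
            joint.append(IGNORE)  # both networks are unsure about this pixel
    return joint


# Foreground probabilities from two networks for the same four pixels.
probs_a = [0.98, 0.55, 0.10, 0.60]
probs_b = [0.90, 0.03, 0.45, 0.52]
labels_a = select_pseudo_label(probs_a)
joint = joint_pseudo_label(probs_a, probs_b)
```

In this toy case network A alone would discard the second pixel, but the joint label recovers it from network B, which is confident there. The last pixel is ignored because neither network is confident.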
Problem

Research questions and friction points this paper is trying to address.

Robust semi-supervised segmentation for endoscopic video frames
Utilizing limited annotations and unlabeled data effectively
Improving segmentation accuracy in clinical endoscopic applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-supervision between two networks for mutual guidance
Uncertainty-guided pseudo-labels from high-confidence unlabeled data
Spatiotemporal corrective network using endoscopic video information
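The first item, cross-supervision, can be sketched as a symmetric loss in which each network is scored against the other's hard prediction. This is a generic illustration in the style of cross pseudo supervision, not the authors' implementation: the function names, the 0.5 threshold, and the use of binary cross-entropy are assumptions. In an actual training framework the pseudo-labels would be treated as constants, so no gradient flows through the network that produced them.

```python
import math

IGNORE = -1  # hypothetical marker for pixels that carry no supervision


def masked_bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over pixels whose label is not IGNORE."""
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y == IGNORE:
            continue
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


def harden(probs, threshold=0.5):
    """Turn foreground probabilities into hard 0/1 pseudo-labels."""
    return [1 if p >= threshold else 0 for p in probs]


def cross_supervision_loss(probs_a, probs_b):
    """Each network is penalised against the other's hard prediction."""
    loss_a = masked_bce(probs_a, harden(probs_b))  # B supervises A
    loss_b = masked_bce(probs_b, harden(probs_a))  # A supervises B
    return loss_a + loss_b


# The loss is small when the networks agree and large when they disagree.
agree = cross_supervision_loss([0.9, 0.1], [0.8, 0.2])
disagree = cross_supervision_loss([0.9, 0.1], [0.2, 0.8])
```

Passing labels that contain `IGNORE` to `masked_bce` restricts the loss to confident pixels, which is one way an uncertainty-guided selection step could be combined with this loss.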
Hao Li
Vanderbilt University
Daiwei Lu
Vanderbilt University
Xing Yao
Vanderbilt University
Nicholas Kavoussi
Vanderbilt University Medical Center
Ipek Oguz
Vanderbilt University
Medical image computing, medical image analysis, segmentation, image registration, rodent imaging