Endo-SemiS: Towards Robust Semi-Supervised Image Segmentation for Endoscopic Video

📅 2025-12-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address scarce annotations and poor cross-scenario generalization in endoscopic video segmentation, this paper proposes a robust semi-supervised framework. It combines a dual-network cross-supervision architecture with uncertainty-guided dynamic pseudo-label selection and multi-granularity mutual learning at both the feature and image levels. It also adds a 3D CNN-based spatiotemporal calibration module that explicitly models inter-frame continuity. The core contributions are: (i) a four-strategy collaborative semi-supervised mechanism tailored to endoscopic videos, described as the first of its kind; and (ii) effective generalization across clinical scenarios, demonstrated on ureteroscopy lithotripsy and colonoscopy polyp datasets using only 10% labeled data. On two real-world endoscopic benchmarks, the method achieves Dice scores of 92.3% (kidney stones) and 86.7% (polyps), surpassing state-of-the-art methods. The code is publicly available.

📝 Abstract

In this paper, we present Endo-SemiS, a semi-supervised segmentation framework that provides reliable segmentation of endoscopic video frames with limited annotation. Endo-SemiS uses four strategies to improve performance by effectively utilizing all available data, particularly unlabeled data: (1) Cross-supervision, in which two individual networks supervise each other; (2) Uncertainty-guided pseudo-labels from unlabeled data, which are generated by selecting high-confidence regions to improve their quality; (3) Joint pseudo-label supervision, which aggregates reliable pixels from the pseudo-labels of both networks to provide accurate supervision for unlabeled data; and (4) Mutual learning, where both networks learn from each other at the feature and image levels, reducing variance and guiding them toward a consistent solution. Additionally, a separate corrective network utilizes spatiotemporal information from endoscopy video to improve segmentation performance. Endo-SemiS is evaluated on two clinical applications: kidney stone laser lithotomy from ureteroscopy and polyp screening from colonoscopy. Compared to state-of-the-art segmentation methods, Endo-SemiS achieves substantially superior results on both datasets with limited labeled data. The code is publicly available at https://github.com/MedICL-VU/Endo-SemiS
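Strategies (2) and (3) in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes binary segmentation, uses normalized binary entropy as the uncertainty measure, and resolves each pixel by taking the label from whichever network is more confident. The names `binary_entropy`, `pseudo_label`, `joint_pseudo_label`, `IGNORE`, and the `max_entropy` threshold are all hypothetical, and images are represented as flat lists of foreground probabilities to keep the example dependency-free.

```python
import math

IGNORE = -1  # hypothetical marker: pixel is excluded from the unlabeled loss


def binary_entropy(p: float) -> float:
    """Binary entropy in bits, in [0, 1]; 0 means fully confident."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def pseudo_label(probs, max_entropy=0.5):
    """Binarize one network's output, keeping only low-entropy pixels."""
    return [
        (1 if p >= 0.5 else 0) if binary_entropy(p) <= max_entropy else IGNORE
        for p in probs
    ]


def joint_pseudo_label(probs_a, probs_b, max_entropy=0.5):
    """Merge two networks' predictions pixel by pixel.

    Assumption (not from the paper): the more confident network wins,
    and a pixel is ignored when neither network passes the threshold.
    """
    joint = []
    for pa, pb in zip(probs_a, probs_b):
        ha, hb = binary_entropy(pa), binary_entropy(pb)
        if min(ha, hb) > max_entropy:
            joint.append(IGNORE)
        else:
            best = pa if ha <= hb else pb
            joint.append(1 if best >= 0.5 else 0)
    return joint


# Foreground probabilities for four pixels from two hypothetical networks.
probs_a = [0.98, 0.55, 0.03, 0.60]
probs_b = [0.90, 0.05, 0.45, 0.52]

labels_a = pseudo_label(probs_a)                 # [1, -1, 0, -1]
labels_b = pseudo_label(probs_b)                 # [1, 0, -1, -1]
joint = joint_pseudo_label(probs_a, probs_b)     # [1, 0, 0, -1]
```

In this toy case each network alone labels only two pixels, while the joint label covers three, which is the intuition behind aggregating reliable pixels from both networks.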

Problem

Research questions and friction points this paper is trying to address.

Robust semi-supervised segmentation for endoscopic video frames

Utilizing limited annotations and unlabeled data effectively

Improving segmentation accuracy in clinical endoscopic applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-supervision between two networks for mutual guidance

Uncertainty-guided pseudo-labels from high-confidence unlabeled data

Spatiotemporal corrective network using endoscopic video information
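The cross-supervision idea listed above can be sketched as a loss in which each network is penalized against the other's hard pseudo-labels. This is a minimal illustration under stated assumptions, not the paper's loss: it uses binary cross-entropy, a fixed confidence cutoff, and flat lists of foreground probabilities. The names `masked_bce`, `hard_labels`, `cross_supervision_loss`, `IGNORE`, and `min_confidence` are hypothetical.

```python
import math

IGNORE = -1  # hypothetical marker for pixels without a trusted pseudo-label


def masked_bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels whose target is not IGNORE."""
    total, count = 0.0, 0
    for p, t in zip(probs, targets):
        if t == IGNORE:
            continue
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


def hard_labels(probs, min_confidence=0.8):
    """Binarize probabilities, ignoring pixels below the confidence cutoff."""
    return [
        (1 if p >= 0.5 else 0) if max(p, 1.0 - p) >= min_confidence else IGNORE
        for p in probs
    ]


def cross_supervision_loss(probs_a, probs_b):
    """Sum of both directions: A learns from B's labels and B from A's.

    In a gradient-based framework the pseudo-labels would be treated as
    constants (no gradient flows through the network that produced them).
    """
    loss_a = masked_bce(probs_a, hard_labels(probs_b))
    loss_b = masked_bce(probs_b, hard_labels(probs_a))
    return loss_a + loss_b


# Two hypothetical networks that agree, then disagree, on two pixels.
agree = cross_supervision_loss([0.9, 0.1], [0.95, 0.05])
disagree = cross_supervision_loss([0.9, 0.1], [0.05, 0.95])
```

The loss is small when the two networks agree and large when they contradict each other, which pushes them toward a consistent prediction on unlabeled frames.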
