🤖 AI Summary
To address the challenges of scarce annotations and poor cross-scenario generalization in few-shot endoscopic video segmentation, this paper proposes a robust semi-supervised framework. Methodologically, it introduces a dual-network cross-supervision architecture integrated with uncertainty-guided dynamic pseudo-label selection and multi-granularity mutual learning at both feature and image levels; notably, it pioneers a 3D CNN-based spatiotemporal calibration module to explicitly model inter-frame continuity. The core contributions are: (i) the first design of a quadruple collaborative semi-supervised mechanism tailored for endoscopic videos; and (ii) effective cross-clinical-scenario generalization—demonstrated on ureteroscopy lithotripsy and colonoscopy polyp datasets—using only 10% labeled data. On two real-world endoscopic benchmarks, the method achieves Dice scores of 92.3% (kidney stones) and 86.7% (polyps), significantly surpassing state-of-the-art methods. The code is publicly available.
📝 Abstract
In this paper, we present Endo-SemiS, a semi-supervised segmentation framework for providing reliable segmentation of endoscopic video frames with limited annotation. EndoSemiS uses 4 strategies to improve performance by effectively utilizing all available data, particularly unlabeled data: (1) Cross-supervision between two individual networks that supervise each other; (2) Uncertainty-guided pseudo-labels from unlabeled data, which are generated by selecting high-confidence regions to improve their quality; (3) Joint pseudolabel supervision, which aggregates reliable pixels from the pseudo-labels of both networks to provide accurate supervision for unlabeled data; and (4) Mutual learning, where both networks learn from each other at the feature and image levels, reducing variance and guiding them toward a consistent solution. Additionally, a separate corrective network that utilizes spatiotemporal information from endoscopy video to improve segmentation performance. Endo-SemiS is evaluated on two clinical applications: kidney stone laser lithotomy from ureteroscopy and polyp screening from colonoscopy. Compared to state-of-the-art segmentation methods, Endo-SemiS substantially achieves superior results on both datasets with limited labeled data. The code is publicly available at https://github.com/MedICL-VU/Endo-SemiS