🤖 AI Summary
Current automatic polyp counting methods in colonoscopy rely solely on visual appearance modeling, neglecting temporal trajectory structure—leading to fragmented clustering and inaccurate counts. To address this, we propose a Temporal-Supervised Contrastive Learning (TSC-CL) framework that explicitly models both intra-trajectory appearance variation and temporal continuity via a novel supervised contrastive loss with temporal adjacency constraints. Our method jointly optimizes trajectory segment representation learning, temporal-aware clustering, and end-to-end training. Evaluated on public benchmarks using leave-one-sequence-out cross-validation, TSC-CL reduces fragmentation rate by 2.2× compared to state-of-the-art approaches and achieves superior polyp counting accuracy across multiple metrics. These results demonstrate a significant breakthrough beyond appearance-only paradigms, establishing new performance boundaries for video-based polyp quantification.
📝 Abstract
Automated polyp counting in colonoscopy is a crucial step toward automated procedure reporting and quality control, aiming to enhance the cost-effectiveness of colonoscopy screening. Counting polyps in a procedure involves detecting and tracking polyps, and then clustering tracklets that belong to the same polyp entity. Existing methods for polyp counting rely on self-supervised learning and primarily leverage visual appearance, neglecting temporal relationships in both tracklet feature learning and clustering stages. In this work, we introduce a paradigm shift by proposing a supervised contrastive loss that incorporates temporally-aware soft targets. Our approach captures intra-polyp variability while preserving inter-polyp discriminability, leading to more robust clustering. Additionally, we improve tracklet clustering by integrating a temporal adjacency constraint, reducing false positive re-associations between visually similar but temporally distant tracklets. We train and validate our method on publicly available datasets and evaluate its performance with a leave-one-out cross-validation strategy. Results demonstrate a 2.2x reduction in fragmentation rate compared to prior approaches. Our results highlight the importance of temporal awareness in polyp counting, establishing a new state-of-the-art. Code is available at https://github.com/lparolari/temporally-aware-polyp-counting.