🤖 AI Summary
To address point-tracking failures in surgical videos caused by domain shift and scarce annotations, this paper proposes SurgTracker, a semi-supervised adaptation framework. SurgTracker employs a teacher-student architecture, with the teacher identical to the student in architecture and initialization, for online self-distillation, and introduces a cycle-consistency-based pseudo-label filtering mechanism. By exploiting the geometric and temporal consistency of point trajectories instead of multi-teacher ensembles, it improves pseudo-label quality and supervision stability without the computational overhead of maintaining multiple teachers. By bridging the synthetic-to-real domain gap and explicitly modeling point trajectories, SurgTracker improves tracking performance on the STIR benchmark using only 80 unlabeled surgical videos, demonstrating its effectiveness and practicality in high-domain-shift, low-resource medical settings.
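The cycle-consistency filter described above can be sketched as a forward-backward tracking check: track query points to the end of a clip, re-track the endpoints through the reversed clip, and keep only trajectories that return close to where they started. This is a minimal illustration, not the paper's code; `track_fn`, the `(T, N, 2)` trajectory shape, and the pixel `threshold` are assumptions.

```python
import numpy as np

def cycle_consistency_filter(track_fn, points, video, threshold=2.0):
    """Filter pseudo-label trajectories by forward-backward cycle consistency.

    track_fn(video, points) -> (T, N, 2) trajectories (hypothetical interface).
    points: (N, 2) query points in the first frame.
    Returns the surviving forward trajectories and the boolean keep mask.
    """
    # Forward pass: track query points from the first to the last frame.
    fwd = track_fn(video, points)          # (T, N, 2)
    # Backward pass: re-track the forward endpoints through the reversed video.
    bwd = track_fn(video[::-1], fwd[-1])   # (T, N, 2)
    # Cycle error: how far each point lands from its start after the round trip.
    err = np.linalg.norm(bwd[-1] - points, axis=-1)  # (N,)
    keep = err < threshold                 # temporally consistent trajectories only
    return fwd[:, keep], keep
```

Trajectories with large round-trip error are discarded rather than down-weighted, so the student is never supervised on geometrically inconsistent pseudo-labels.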
📝 Abstract
Synthetic datasets have enabled significant progress in point tracking by providing large-scale, densely annotated supervision. However, deploying these models in real-world domains remains challenging due to domain shift and a lack of labeled data, issues that are especially severe in surgical videos, where scenes exhibit complex tissue deformation, occlusion, and lighting variation. While recent approaches adapt synthetic-trained trackers to natural videos using teacher ensembles or augmentation-heavy pseudo-labeling pipelines, their effectiveness in high-shift domains like surgery remains unexplored. This work presents SurgTracker, a semi-supervised framework for adapting synthetic-trained point trackers to surgical video using filtered self-distillation. Pseudo-labels are generated online by a fixed teacher (identical in architecture and initialization to the student) and are filtered using a cycle-consistency constraint to discard temporally inconsistent trajectories. This simple yet effective design enforces geometric consistency and provides stable supervision throughout training, without the computational overhead of maintaining multiple teachers. Experiments on the STIR benchmark show that SurgTracker improves tracking performance using only 80 unlabeled videos, demonstrating its potential for robust adaptation in high-shift, data-scarce domains.
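The filtered self-distillation objective described in the abstract amounts to a standard distillation loss restricted to the pseudo-labels that survive the consistency filter. The sketch below shows that masking step only; the L1 loss, the `(T, N, 2)` trajectory layout, and the `(N,)` boolean mask are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def filtered_distillation_loss(student_traj, teacher_traj, keep_mask):
    """Mean L1 distance between student and teacher trajectories,
    computed only over pseudo-labels kept by the consistency filter.

    student_traj, teacher_traj: (T, N, 2) predicted point trajectories.
    keep_mask: (N,) boolean mask of trajectories that passed filtering.
    """
    if not keep_mask.any():
        return 0.0  # no reliable pseudo-labels in this clip; skip supervision
    # Supervise the student only where the fixed teacher is trusted.
    diff = np.abs(student_traj[:, keep_mask] - teacher_traj[:, keep_mask])
    return float(diff.mean())
```

Because the teacher is fixed and shares the student's initialization, the supervision signal stays stable across training; the mask simply prevents inconsistent teacher trajectories from ever contributing gradient.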