Endo-TTAP: Robust Endoscopic Tissue Tracking via Multi-Facet Guided Attention and Hybrid Flow-point Supervision

📅 2025-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of long-term tissue point tracking in endoscopic videos (complex non-rigid deformations, instrument occlusions, and the scarcity of densely annotated trajectories), this paper proposes a learning framework with hybrid flow-point supervision. Methodologically, the authors introduce the Multi-Facet Guided Attention (MFGA) mechanism, which jointly models multi-scale optical flow, DINOv2-derived semantic features, and explicit motion priors to predict point positions with uncertainty and occlusion awareness. They further design a two-stage curriculum learning strategy that combines uncertainty- and occlusion-aware regularization on synthetic data, semi-supervised pseudo-label distillation, optical-flow consistency constraints, and an Auxiliary Curriculum Adapter (ACA) to enhance generalization. Evaluated on two MICCAI benchmarks and a newly constructed endoscopic dataset, the method achieves state-of-the-art performance, significantly improving tracking robustness and accuracy in challenging clinical scenarios.
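The MFGA idea summarized above can be pictured as an attention-weighted fusion of three feature streams per tracked point. Below is a minimal NumPy sketch under assumed shapes and names (random weights stand in for learned parameters; this is not the authors' implementation): flow, semantic, and motion-prior features are projected to a shared dimension, fused by per-facet attention, and decoded into a position offset plus uncertainty and occlusion outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class MFGASketch:
    """Toy multi-facet guided attention: fuse flow, semantic, and
    motion-prior features for each tracked point (illustrative only)."""
    def __init__(self, d_flow, d_sem, d_motion, d_model):
        # random projections stand in for learned parameters
        self.W_flow = rng.standard_normal((d_flow, d_model)) * 0.1
        self.W_sem = rng.standard_normal((d_sem, d_model)) * 0.1
        self.W_mot = rng.standard_normal((d_motion, d_model)) * 0.1
        self.w_attn = rng.standard_normal(d_model) * 0.1      # scores each facet
        self.W_head = rng.standard_normal((d_model, 4)) * 0.1  # dx, dy, logvar, occ

    def __call__(self, f_flow, f_sem, f_mot):
        # project each facet to the shared model dimension
        facets = np.stack([f_flow @ self.W_flow,
                           f_sem @ self.W_sem,
                           f_mot @ self.W_mot], axis=1)       # (N, 3, d_model)
        attn = softmax(facets @ self.w_attn, axis=1)          # (N, 3) facet weights
        fused = (attn[..., None] * facets).sum(axis=1)        # (N, d_model)
        out = fused @ self.W_head                             # (N, 4)
        offset, logvar, occ_logit = out[:, :2], out[:, 2], out[:, 3]
        return offset, np.exp(logvar), occ_logit              # variance as uncertainty

N = 5
mfga = MFGASketch(d_flow=8, d_sem=16, d_motion=4, d_model=32)
offset, var, occ = mfga(rng.standard_normal((N, 8)),
                        rng.standard_normal((N, 16)),
                        rng.standard_normal((N, 4)))
print(offset.shape, var.shape, occ.shape)  # (5, 2) (5,) (5,)
```

In the actual model the attention would operate over richer multi-scale features; the sketch only shows the fusion pattern and the joint position/uncertainty/occlusion output head.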

📝 Abstract
Accurate tissue point tracking in endoscopic videos is critical for robotic-assisted surgical navigation and scene understanding, but remains challenging due to complex deformations, instrument occlusion, and the scarcity of dense trajectory annotations. Existing methods struggle with long-term tracking under these conditions due to limited feature utilization and annotation dependence. We present Endo-TTAP, a novel framework addressing these challenges through: (1) A Multi-Facet Guided Attention (MFGA) module that synergizes multi-scale flow dynamics, DINOv2 semantic embeddings, and explicit motion patterns to jointly predict point positions with uncertainty and occlusion awareness; (2) A two-stage curriculum learning strategy employing an Auxiliary Curriculum Adapter (ACA) for progressive initialization and hybrid supervision. Stage I utilizes synthetic data with optical flow ground truth for uncertainty-occlusion regularization, while Stage II combines unsupervised flow consistency and semi-supervised learning with refined pseudo-labels from off-the-shelf trackers. Extensive validation on two MICCAI Challenge datasets and our collected dataset demonstrates that Endo-TTAP achieves state-of-the-art performance in tissue point tracking, particularly in scenarios characterized by complex endoscopic conditions. The source code and dataset will be available at https://anonymous.4open.science/r/Endo-TTAP-36E5.
Problem

Research questions and friction points this paper is trying to address.

Accurate tissue tracking in endoscopic videos under deformations and occlusions
Long-term tracking challenges due to limited feature utilization
Improving tracking performance in complex endoscopic conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Facet Guided Attention for robust tracking
Two-stage curriculum learning with hybrid supervision
Combines flow dynamics and semantic embeddings
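The two-stage hybrid supervision listed above can be sketched as two loss functions. The NumPy snippet below is an assumption-laden illustration (names, weightings, and the confidence threshold are mine, not the paper's exact losses): Stage I uses synthetic ground truth with uncertainty-weighted regression and occlusion classification; Stage II combines forward-backward flow consistency with confidence-gated pseudo-label distillation from an off-the-shelf tracker.

```python
import numpy as np

def stage1_loss(pred, gt, log_var, occ_logit, occ_gt):
    """Synthetic-data stage: uncertainty-weighted position regression
    (heteroscedastic-style) plus occlusion classification (illustrative)."""
    inv_var = np.exp(-log_var)
    pos = (inv_var * ((pred - gt) ** 2).sum(-1) + log_var).mean()
    p = 1.0 / (1.0 + np.exp(-occ_logit))               # sigmoid occlusion probability
    eps = 1e-7
    occ = -(occ_gt * np.log(p + eps)
            + (1 - occ_gt) * np.log(1 - p + eps)).mean()
    return pos + occ

def stage2_loss(fwd_flow, bwd_flow, pred, pseudo, conf, tau=0.5):
    """Real-data stage: forward-backward flow consistency plus
    confidence-gated pseudo-label distillation (illustrative)."""
    consistency = np.abs(fwd_flow + bwd_flow).mean()   # cycle error ~ 0 when consistent
    mask = (conf > tau).astype(float)                  # keep confident pseudo-labels only
    distill = (mask * ((pred - pseudo) ** 2).sum(-1)).sum() / max(mask.sum(), 1.0)
    return consistency + distill
```

A sanity check of the intended behavior: perfect predictions with confident occlusion logits drive `stage1_loss` toward zero, and exactly cycle-consistent flows with matching pseudo-labels make `stage2_loss` zero.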
Rulin Zhou
The Chinese University of Hong Kong Shenzhen Research Institute
Deep Learning, Medical Image Processing
Wenlong He
College of Mechatronics and Engineering, Shenzhen University, Shenzhen, China
An Wang
Dept. of Electronic Engineering, Shun Hing Institute of Advanced Engineering (SHIAE), The Chinese University of Hong Kong, Hong Kong SAR, China
Qiqi Yao
College of Mechatronics and Engineering, Shenzhen University, Shenzhen, China
Haijun Hu
Division of Gastrointestinal Surgery, Department of General Surgery, Shenzhen People’s Hospital, Shenzhen, China
Jiankun Wang
Southern University of Science and Technology
Robotics, Path Planning, Motion Control, Human-Robot Interaction
Xi Zhang
College of Mechatronics and Engineering, Shenzhen University, Shenzhen, China
Hongliang Ren
Chinese University of Hong Kong | National University of Singapore | JHU/Harvard(RF) | CUHK(PhD)
Biorobotics & intelligent systems, medical mechatronics, continuum/soft flexible robots/sensors, multisensory perception