🤖 AI Summary
In ultrasound-guided fine-needle aspiration (FNA) biopsy, the needle's rapid reciprocating motion transiently degrades the image and causes severe tracking failure. To address this, we propose MrTrack, a register-based temporal modeling framework built upon the Mamba architecture. Our key contributions are: (1) the first Mamba-driven temporal register mechanism, which caches global historical context in a register bank and retrieves it to compensate for instantaneous imaging artifacts; and (2) a self-supervised register diversify loss that mitigates feature collapse by encouraging dimensional independence and diversity among the temporal prompts. The method integrates Mamba’s efficient sequence modeling, a register bank, contrastive learning, and robust ultrasound feature matching. Evaluated on both motor-driven and manual FNA datasets, our approach achieves a 12.6% improvement in localization accuracy, a 23.4% gain in robustness, and real-time inference speed (>30 FPS), significantly outperforming existing state-of-the-art methods.
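To make the idea of "dimensional independence" concrete, here is a minimal sketch of one common way to penalize correlated feature dimensions: the mean squared off-diagonal entry of the correlation matrix computed over a set of register vectors. This is an illustrative stand-in, not the loss defined in the paper; the function name `register_diversity_penalty` and the `eps` parameter are assumptions made for this example.

```python
import math


def register_diversity_penalty(registers, eps=1e-8):
    """Hypothetical decorrelation penalty (not the paper's formulation).

    registers: list of N vectors, each with D >= 2 entries.
    Returns the mean squared off-diagonal correlation between dimensions:
    close to 1 when all dimensions move together, 0 when they are uncorrelated.
    """
    n = len(registers)
    d = len(registers[0])

    # Standardize every dimension across the N register vectors.
    columns = []
    for j in range(d):
        col = [r[j] for r in registers]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        std = math.sqrt(var + eps)  # eps guards against zero variance
        columns.append([(x - mean) / std for x in col])

    # Average the squared correlation over all pairs of distinct dimensions.
    total = 0.0
    for a in range(d):
        for b in range(d):
            if a != b:
                corr = sum(x * y for x, y in zip(columns[a], columns[b])) / n
                total += corr ** 2
    return total / (d * (d - 1))


# Two dimensions that carry the same signal versus two uncorrelated ones.
collapsed = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
diverse = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
print(register_diversity_penalty(collapsed))
print(register_diversity_penalty(diverse))
```

Minimizing a term of this kind pushes each register dimension to encode something the others do not, which is the intuition behind guarding against feature collapse.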
📝 Abstract
Ultrasound-guided fine-needle aspiration (FNA) biopsy is a common minimally invasive diagnostic procedure. However, an aspiration needle tracker that can handle rapid reciprocating motion is still missing. MrTrack, an aspiration needle tracker with a Mamba-based register mechanism, is proposed. MrTrack leverages a Mamba-based register extractor to sequentially distill global context from each historical search map, storing these temporal cues in a register bank. The Mamba-based register retriever then retrieves temporal prompts from the register bank to provide external cues when the current vision features are temporarily unusable due to rapid reciprocating motion and imaging degradation. A self-supervised register diversify loss is proposed to encourage feature diversity and dimension independence within the learned register, mitigating feature collapse. Comprehensive experiments conducted on both motorized and manual aspiration datasets demonstrate that MrTrack not only outperforms state-of-the-art trackers in accuracy and robustness but also achieves superior inference efficiency.
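The store-then-retrieve flow described above can be sketched as a small fixed-capacity buffer. This is a simplified illustration under stated assumptions: the class name `RegisterBank`, its `capacity` parameter, and the first-in-first-out eviction policy are hypothetical, and a softmax-weighted average over dot-product scores is swapped in for the Mamba-based extractor and retriever, which are not reproduced here.

```python
import math
from collections import deque


class RegisterBank:
    """Hypothetical fixed-capacity store of per-frame register vectors."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently evicts the oldest entry when full.
        self.slots = deque(maxlen=capacity)

    def store(self, register):
        """Append the register distilled from the latest frame."""
        self.slots.append(list(register))

    def retrieve(self, query):
        """Return a temporal prompt as a softmax-weighted average of the
        stored registers (a stand-in for the Mamba-based retriever)."""
        if not self.slots:
            return None
        scores = [sum(q * r for q, r in zip(query, reg)) for reg in self.slots]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        norm = sum(weights)
        return [
            sum(w * reg[i] for w, reg in zip(weights, self.slots)) / norm
            for i in range(len(query))
        ]


bank = RegisterBank(capacity=2)
for frame_register in ([1.0, 0.0], [0.0, 1.0], [0.0, 2.0]):
    bank.store(frame_register)

# Only the two most recent registers remain; a zero query weights them equally.
prompt = bank.retrieve([0.0, 0.0])
print(prompt)
```

In a tracker, the retrieved prompt would be fused with the current frame's features, so that a frame blurred by fast needle motion can still draw on context from earlier, cleaner frames.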