AI Summary
Medical image segmentation relies heavily on large annotated datasets, and existing few-shot methods still require substantial training data, which hinders efficient clinical deployment. To address this, we propose a training-free few-shot framework for 3D medical image segmentation. The 3D volume is reformulated as a video sequence so that SAM2's temporal modeling capability can be applied. Only a single labeled support image is needed: data augmentation and frame-level dynamic matching produce mask prompts that directly drive SAM2 to segment the query volume. Key contributions include: (i) the first training-free paradigm that eliminates fine-tuning; (ii) a dynamic matching strategy driven by support-query frame similarity; and (iii) the first reformulation of 3D medical segmentation as a video segmentation task. Our method achieves state-of-the-art performance on mainstream few-shot benchmarks, significantly improving both segmentation accuracy and annotation efficiency, while offering plug-and-play clinical applicability.
Abstract
The reliance on large labeled datasets presents a significant challenge in medical image segmentation. Few-shot learning offers a potential solution, but existing methods often still require substantial training data. This paper proposes a novel approach that leverages the Segment Anything Model 2 (SAM2), a vision foundation model with strong video segmentation capabilities. We treat 3D medical image volumes as video sequences, departing from the traditional slice-by-slice paradigm. Our core innovation is a support-query matching strategy: we perform extensive data augmentation on a single labeled support image and, for each frame in the query volume, select the most similar augmented support image. The selected image and its corresponding mask are used as a mask prompt that drives SAM2's video segmentation. This approach entirely avoids model retraining or parameter updates. We demonstrate state-of-the-art performance on benchmark few-shot medical image segmentation datasets, achieving significant improvements in accuracy and annotation efficiency. This plug-and-play method offers a powerful and generalizable solution for 3D medical image segmentation.
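The support-query matching step can be illustrated with a minimal sketch. The abstract does not specify the augmentations or the similarity measure, so the flips and 90-degree rotations and the mean-centered cosine similarity below are assumptions chosen for illustration, and all function names are hypothetical. The SAM2 call itself is omitted; the sketch only shows how a volume might be treated as frames and how one augmented support pair could be chosen per frame.

```python
import numpy as np


def volume_to_frames(volume):
    """Treat a 3D volume of shape (D, H, W) as D video frames scaled to [0, 1]."""
    vol = volume.astype(np.float64)
    lo, hi = vol.min(), vol.max()
    scale = hi - lo if hi > lo else 1.0
    return (vol - lo) / scale


def augment_support(image, mask):
    """Assumed augmentation set: 4 rotations, each with and without a horizontal flip.

    The same transform is applied to the image and its mask so they stay aligned.
    """
    pairs = []
    for k in range(4):
        img_r, msk_r = np.rot90(image, k), np.rot90(mask, k)
        pairs.append((img_r, msk_r))
        pairs.append((np.fliplr(img_r), np.fliplr(msk_r)))
    return pairs


def similarity(a, b):
    """Mean-centered cosine similarity (an assumed metric, not taken from the paper)."""
    if a.shape != b.shape:
        return float("-inf")  # e.g. a rotated non-square image cannot match
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def match_support_to_frames(frames, support_pairs):
    """For each query frame, return the index of the most similar augmented support image."""
    matches = []
    for frame in frames:
        scores = [similarity(frame, img) for img, _ in support_pairs]
        matches.append(int(np.argmax(scores)))
    return matches


# Toy example: a synthetic support image and a two-frame "volume"
# built from transformed copies of it.
rng = np.random.default_rng(0)
support_image = rng.random((8, 8))
support_mask = (support_image > 0.5).astype(np.uint8)

pairs = augment_support(support_image, support_mask)
query_volume = np.stack([np.rot90(support_image, 1), np.fliplr(support_image)])
frames = volume_to_frames(query_volume)
matched = match_support_to_frames(frames, pairs)

# Each (image, mask) pair selected here would serve as the mask prompt for its frame.
prompts = [pairs[i] for i in matched]
```

In a real pipeline the similarity would more plausibly be computed on learned features than on raw pixels, and each selected mask would be passed to SAM2 as a prompt for the corresponding frame; both of those steps are outside this sketch.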