Breaking the Data Barrier: Robust Few-Shot 3D Vessel Segmentation using Foundation Models

📅 2026-02-27
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses domain shift in medical vascular segmentation, where existing methods degrade sharply under new imaging devices or protocols because they rely on large annotated datasets that are scarce in clinical practice. To overcome this, the study presents the first effective adaptation of the 2D vision foundation model DINOv3 to 3D vascular segmentation, introducing a lightweight 3D adapter, a multi-scale 3D feature aggregator, and a Z-axis channel embedding to preserve vascular continuity and improve cross-domain robustness. With only five training samples, the method achieves a Dice score of 43.42%, a 30% relative improvement over nnU-Net; on out-of-domain data it attains a Dice of 21.37%, a 50% relative improvement that substantially surpasses baselines such as SwinUNETR.

๐Ÿ“ Abstract
State-of-the-art vessel segmentation methods typically require large-scale annotated datasets and suffer severe performance degradation under domain shifts. In clinical practice, however, acquiring extensive annotations for every new scanner or protocol is infeasible. To address this, we propose a novel framework leveraging a pre-trained Vision Foundation Model (DINOv3) adapted for volumetric vessel segmentation. We introduce a lightweight 3D Adapter for volumetric consistency, a multi-scale 3D Aggregator for hierarchical feature fusion, and a Z-channel embedding to bridge the gap between 2D pre-training and 3D medical modalities, enabling the model to capture continuous vascular structures from limited data. We validated our method on the TopCoW (in-domain) and Lausanne (out-of-distribution) datasets. In the extreme few-shot regime with 5 training samples, our method achieved a Dice score of 43.42%, marking a 30% relative improvement over the state-of-the-art nnU-Net (33.41%) and outperforming other Transformer-based baselines, such as SwinUNETR and UNETR, by up to 45%. Furthermore, in the out-of-distribution setting, our model demonstrated superior robustness, achieving a 50% relative improvement over nnU-Net (21.37% vs. 14.22%), which suffered from severe domain overfitting. Ablation studies confirmed that our 3D adaptation mechanism and multi-scale aggregation strategy are critical for vascular continuity and robustness. Our results suggest foundation models offer a viable cold-start solution, improving clinical reliability under data scarcity or domain shifts.
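One plausible reading of the Z-channel embedding described above is that each axial slice is packed together with its neighbouring slices along the channel dimension, so a 2D backbone such as DINOv3 receives local 3D context per forward pass. The sketch below illustrates only that packing idea; the function name, the clamp-based boundary padding, and the context width are assumptions, not the authors' implementation.

```python
def z_channel_pack(volume, context=1):
    """Group each slice with its +/- `context` axial neighbours.

    volume:  a list of D slices (any 2D payload, e.g. H x W arrays).
    Returns: a list of D channel stacks, each holding 2*context + 1 slices,
             so a 2D model can consume one stack as a multi-channel image.
    Edge slices are padded by clamping indices to the volume boundary.
    """
    depth = len(volume)
    stacks = []
    for z in range(depth):
        stack = [volume[min(max(z + dz, 0), depth - 1)]
                 for dz in range(-context, context + 1)]
        stacks.append(stack)
    return stacks

# Toy "volume" of 4 slices labelled by index, packed with 1 neighbour each side.
vol = ["s0", "s1", "s2", "s3"]
packed = z_channel_pack(vol, context=1)
```

With `context=1` each stack has three channels, mirroring how a 2D RGB-pretrained backbone could ingest slice triplets; boundary slices simply repeat themselves.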
Problem

Research questions and friction points this paper is trying to address.

few-shot learning
3D vessel segmentation
domain shift
data scarcity
medical image segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision Foundation Model
3D Adapter
Multi-scale 3D Aggregator
Z-channel Embedding
Few-shot 3D Segmentation
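The multi-scale 3D Aggregator listed above presumably fuses features drawn from several backbone stages. As a hedged, generic illustration of coarse-to-fine fusion (an FPN-style sum, not the paper's actual design), the sketch below upsamples a coarser feature map and adds it to the next finer one; all names and the nearest-neighbour upsampling choice are assumptions.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def aggregate(scales):
    """Fuse feature maps ordered coarsest-to-finest by upsample-and-add."""
    fused = scales[0]
    for finer in scales[1:]:
        fused = upsample2x(fused)
        fused = [[a + b for a, b in zip(r1, r2)]
                 for r1, r2 in zip(fused, finer)]
    return fused

# Toy example: a 1x1 coarse map fused into a 2x2 fine map of zeros.
coarse = [[1]]
fine = [[0, 0], [0, 0]]
fused = aggregate([coarse, fine])
```

A real aggregator would operate on 3D feature volumes with learned projections per scale; this sketch only conveys the hierarchical-fusion pattern named in the abstract.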
Kirato Yoshihara — The University of Osaka
Yohei Sugawara — Preferred Networks, Inc.
Yuta Tokuoka — Preferred Networks, Inc.
Lihang Hong — Preferred Networks, Inc.