Multi-Scale Feature Fusion with Image-Driven Spatial Integration for Left Atrium Segmentation from Cardiac MRI Images

📅 2025-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Accurate left atrium (LA) segmentation from cardiac MRI is critical for atrial fibrillation ablation planning, yet manual annotation is labor-intensive and subject to high inter-observer variability, while foundational vision models lose spatial detail through downsampling during feature extraction. To address this, we propose a medical-domain-adapted multi-scale feature fusion architecture. Our key contributions are: (1) a learnable multi-scale feature weighting and fusion mechanism; and (2) the integration of the original high-resolution input image into the decoder via dynamic injection, enabling precise spatial structure reconstruction. The model adopts DINOv2 as the encoder and UNet as the decoder backbone. Evaluated on the LAScarQS 2022 dataset, it achieves a 92.3% Dice score and 84.1% IoU, outperforming the nnUNet baseline and advancing clinically reliable LA segmentation.
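The learnable weighting mechanism described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is not shown here); the function names are hypothetical, the per-block scalar logits stand in for parameters that would normally be trained by backpropagation, and a softmax normalizes them so the fused map is a convex combination of the encoder blocks' features.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_multiscale(features, logits):
    """Fuse same-shaped feature maps from different encoder blocks
    using scalar weights (learnable in a real model; fixed here),
    normalized with a softmax so they sum to 1."""
    w = softmax(np.asarray(logits, dtype=float))
    fused = np.zeros_like(features[0], dtype=float)
    for wi, f in zip(w, features):
        fused += wi * f
    return fused, w

# Toy example: three 2x2 "feature maps" from three encoder blocks.
feats = [np.full((2, 2), v, dtype=float) for v in (1.0, 2.0, 3.0)]
fused, w = fuse_multiscale(feats, logits=[0.0, 0.0, 0.0])
# Equal logits give equal weights, so the fused map is the mean (2.0).
```

During training, gradients flowing into the logits let the network shift weight toward whichever encoder depths are most task-relevant, which is the "dynamic prioritization" the summary refers to.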

📝 Abstract
Accurate segmentation of the left atrium (LA) from late gadolinium-enhanced magnetic resonance imaging plays a vital role in visualizing diseased atrial structures, enabling the diagnosis and management of cardiovascular diseases. It is particularly essential for planning treatment with ablation therapy, a key intervention for atrial fibrillation (AF). However, manual segmentation is time-intensive and prone to inter-observer variability, underscoring the need for automated solutions. Class-agnostic foundation models like DINOv2 have demonstrated remarkable feature extraction capabilities in vision tasks. However, their lack of domain specificity and task-specific adaptation can reduce spatial resolution during feature extraction, impairing the capture of fine anatomical detail in medical imaging. To address this limitation, we propose a segmentation framework that integrates DINOv2 as an encoder with a UNet-style decoder, incorporating multi-scale feature fusion and input image integration to enhance segmentation accuracy. A learnable weighting mechanism dynamically prioritizes hierarchical features from different encoder blocks of the foundation model, optimizing feature selection for task relevance. Additionally, the input image is reintroduced during the decoding stage to preserve high-resolution spatial details, addressing the limitations of downsampling in the encoder. We validate our approach on the LAScarQS 2022 dataset and demonstrate improved performance, with a 92.3% Dice score and 84.1% IoU for the giant architecture variant, compared to the nnUNet baseline model. These findings emphasize the efficacy of our approach in advancing automated left atrium segmentation from cardiac MRI.
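The input-image reintroduction step in the decoder can likewise be sketched in a few lines of NumPy. This is an illustrative assumption about the mechanism, not the paper's actual code: the original image is resized to the decoder stage's resolution (nearest-neighbor here for simplicity) and concatenated onto the channel axis, so a subsequent convolution can draw on high-resolution spatial cues lost to encoder downsampling.

```python
import numpy as np

def nearest_resize(img, out_h, out_w):
    """Nearest-neighbor resize of an (H, W, C) image via index selection."""
    h, w, _ = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def inject_input(decoder_feat, image):
    """Concatenate a resized copy of the input image onto the decoder
    feature map's channel axis (hypothetical helper, for illustration)."""
    h, w, _ = decoder_feat.shape
    img_rs = nearest_resize(image, h, w)
    return np.concatenate([decoder_feat, img_rs], axis=-1)

feat = np.zeros((4, 4, 8))    # decoder feature map: 4x4, 8 channels
image = np.ones((16, 16, 1))  # original grayscale slice: 16x16
out = inject_input(feat, image)
# out has shape (4, 4, 9): 8 decoder channels + 1 image channel
```

In a trained network this concatenation would feed a convolution that learns how much to rely on the raw intensities versus the decoded semantic features at each stage.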
Problem

Research questions and friction points this paper is trying to address.

Automated left atrium segmentation
Multi-scale feature fusion
Enhanced spatial detail preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale feature fusion
Image-driven spatial integration
DINOv2-UNet framework
Bipasha Kundu
Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA
Zixin Yang
Rochester Institute of Technology
Richard Simon
Department of Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA
Cristian Linte
Department of Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA