Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation

📅 2025-03-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses boundary ambiguity and limited robustness in 3D semantic segmentation of spinal medical images, both of which stem from the anatomical complexity of the spine. To this end, we propose an enhanced SwinUNETR architecture. It adds an adaptive Transformer attention mechanism and a dynamic multi-scale feature fusion strategy, which together support precise localization of critical anatomical regions and effective cross-scale contextual modeling within an end-to-end 3D semantic segmentation framework. Evaluated on a public spinal dataset, our model achieves significant improvements over three baselines (3D CNN, 3D U-Net, and 3D U-Net + Transformer) on mean Intersection-over-Union (mIoU), mean Dice coefficient (mDice), and mean accuracy (mAcc). Qualitative visualization further shows more faithful reconstruction of fine-grained structures such as vertebral bodies and pedicles, closely matching ground-truth anatomical morphology.
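For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how they are commonly computed from per-class voxel counts. It is not taken from the paper: the function name `segmentation_metrics` is hypothetical, labels are assumed to be flattened integer class IDs, and mAcc is assumed to mean per-class recall averaged over classes, which the summary does not confirm.

```python
from typing import Dict, Sequence


def segmentation_metrics(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> Dict[str, float]:
    """Return mIoU, mDice, and mAcc over flattened voxel labels.

    Assumption: mAcc is the mean of per-class recall (TP / (TP + FN)).
    Classes absent from both prediction and target are skipped.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious, dices, accs = [], [], []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        if tp + fp + fn == 0:
            continue  # class does not occur anywhere; nothing to score
        ious.append(tp / (tp + fp + fn))           # intersection over union
        dices.append(2 * tp / (2 * tp + fp + fn))  # Dice coefficient
        if tp + fn > 0:
            accs.append(tp / (tp + fn))            # per-class accuracy (recall)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {"mIoU": mean(ious), "mDice": mean(dices), "mAcc": mean(accs)}


if __name__ == "__main__":
    # Six voxels, three classes (e.g. background and two hypothetical structures).
    scores = segmentation_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
    print(scores)
```

Dice weights the overlap twice, so for any class it is at least as large as IoU; reporting both is common in medical segmentation because Dice is less harsh on small structures.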

📝 Abstract

This study proposes a 3D semantic segmentation method for the spine, based on an improved SwinUNETR, to increase segmentation accuracy and robustness. To handle the complex anatomical structure in spinal images, the method introduces a multi-scale fusion mechanism that strengthens feature extraction by combining information from different scales, thereby improving recognition accuracy in the target region. In addition, an adaptive attention mechanism lets the model dynamically adjust its attention to key regions, which improves segmentation along boundaries. Experimental results show that, compared with 3D CNN, 3D U-Net, and 3D U-Net + Transformer, the proposed model achieves significant improvements in mIoU, mDice, and mAcc. Ablation experiments further verify the proposed improvements, showing that both multi-scale fusion and adaptive attention contribute positively to the segmentation task. Visualization of the inference results shows that the model better recovers the true anatomical structure in spinal images. Future research can further optimize the Transformer structure and expand the dataset to improve the model's generalization ability. This study provides an efficient solution for medical image segmentation and is of significance to intelligent medical image analysis.
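The abstract does not specify how the multi-scale fusion weights its inputs, so the following is only an illustrative sketch of one common pattern: input-dependent (softmax-gated) fusion of features from several scales. Everything here is an assumption rather than the authors' design, including the function name `adaptive_fuse`, the `gate_weights` parameters (stand-ins for learned scalars), and the premise that features from each scale have already been resampled to a common length.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(
    scale_features: Sequence[Sequence[float]],
    gate_weights: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse same-length feature vectors from several scales.

    Hypothetical gating: each scale is summarized by its mean activation
    (global average pooling), scaled by a per-scale weight, and the
    resulting scores are turned into fusion weights with softmax.
    Returns the fused vector and the fusion weights.
    """
    if len(scale_features) != len(gate_weights):
        raise ValueError("one gate weight is required per scale")
    length = len(scale_features[0])
    if any(len(f) != length for f in scale_features):
        raise ValueError("all scales must be resampled to the same length")

    descriptors = [sum(f) / length for f in scale_features]
    scores = [w * d for w, d in zip(gate_weights, descriptors)]
    alphas = softmax(scores)  # weights depend on the input, hence "adaptive"

    fused = [
        sum(a * f[i] for a, f in zip(alphas, scale_features))
        for i in range(length)
    ]
    return fused, alphas


if __name__ == "__main__":
    fine = [1.0, 3.0]    # stand-in for high-resolution features
    coarse = [3.0, 1.0]  # stand-in for upsampled low-resolution features
    fused, alphas = adaptive_fuse([fine, coarse], gate_weights=[1.0, 1.0])
    print(fused, alphas)
```

In a real 3D network the same idea would operate on feature volumes with learned gating layers; the point of the sketch is only that the fusion weights are computed from the input rather than fixed.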

Problem

Research questions and friction points this paper is trying to address.

Limited accuracy and robustness of 3D spine segmentation

Insufficient feature extraction for the spine's complex anatomy, addressed with a multi-scale fusion mechanism

Ambiguous boundary segmentation, addressed with an adaptive attention mechanism

Innovation

Methods, ideas, or system contributions that make the work stand out.

Improved SwinUNETR for 3D spine segmentation

Multi-scale fusion enhances feature extraction

Adaptive attention optimizes boundary segmentation
