GLoG-CSUnet: Enhancing Vision Transformers with Adaptable Radiomic Features for Medical Image Segmentation

📅 2025-01-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited capability of Vision Transformers (ViTs) in modeling fine-grained anatomical detail, and their poor generalization under limited training data in medical image segmentation, this paper proposes GLoG-CSUnet, a Swin Transformer architecture augmented with learnable radiomics-inspired priors. Dynamically adaptive Gabor and Laplacian-of-Gaussian (LoG) filters are embedded ahead of the Transformer encoder, so the network jointly captures long-range dependencies and multi-scale texture, edge, and boundary features; together the filters form a radiomics-motivated feature-enhancement module with lightweight parameterization. Evaluated on the Synapse multi-organ and ACDC cardiac benchmarks, the method improves the Dice score by 1.14% and 0.99%, respectively, over state-of-the-art models while adding only 15 and 30 trainable parameters. The design bridges low-level imaging priors with hierarchical Transformer representations, improving both anatomical fidelity and data efficiency in medical segmentation.
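
The tiny parameter counts follow from how the filters are parameterized: each Gabor kernel is regenerated on every forward pass from a handful of trainable scalars (orientation, envelope width, wavelength, phase), so the overhead stays in the tens of parameters regardless of kernel size. Below is a minimal PyTorch sketch of this idea; the class name, initialization, and filter count are illustrative assumptions rather than the authors' implementation (see the linked GitHub repository for that).

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableGabor(nn.Module):
    """Gabor filter bank whose parameters are trained by backprop.
    The kernel is regenerated from four scalars per filter on every
    forward pass, so each filter adds only 4 trainable parameters
    regardless of kernel size (aspect ratio fixed at 1 for brevity)."""

    def __init__(self, num_filters=4, kernel_size=7):
        super().__init__()
        self.kernel_size = kernel_size
        self.theta = nn.Parameter(torch.rand(num_filters) * math.pi)  # orientation
        self.sigma = nn.Parameter(torch.full((num_filters,), 2.0))    # envelope width
        self.lambd = nn.Parameter(torch.full((num_filters,), 4.0))    # wavelength
        self.psi = nn.Parameter(torch.zeros(num_filters))             # phase offset

    def _kernels(self):
        k = self.kernel_size // 2
        coords = torch.arange(-k, k + 1, dtype=torch.float32, device=self.theta.device)
        ys, xs = torch.meshgrid(coords, coords, indexing="ij")
        xs, ys = xs[None], ys[None]                   # (1, K, K)
        theta = self.theta[:, None, None]             # (F, 1, 1)
        x_rot = xs * torch.cos(theta) + ys * torch.sin(theta)
        y_rot = -xs * torch.sin(theta) + ys * torch.cos(theta)
        envelope = torch.exp(-(x_rot**2 + y_rot**2) / (2 * self.sigma[:, None, None] ** 2))
        carrier = torch.cos(2 * math.pi * x_rot / self.lambd[:, None, None]
                            + self.psi[:, None, None])
        return (envelope * carrier)[:, None]          # (F, 1, K, K)

    def forward(self, x):  # x: (B, 1, H, W), e.g. a grayscale CT/MR slice
        return F.conv2d(x, self._kernels(), padding=self.kernel_size // 2)
```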

📝 Abstract
Vision Transformers (ViTs) have shown promise in medical image semantic segmentation (MISS) by capturing long-range correlations. However, ViTs often struggle to model local spatial information effectively, which is essential for accurately segmenting fine anatomical details, particularly when applied to small datasets without extensive pre-training. We introduce Gabor and Laplacian of Gaussian Convolutional Swin Network (GLoG-CSUnet), a novel architecture enhancing Transformer-based models by incorporating learnable radiomic features. This approach integrates dynamically adaptive Gabor and Laplacian of Gaussian (LoG) filters to capture texture, edge, and boundary information, enhancing the feature representation processed by the Transformer model. Our method uniquely combines the long-range dependency modeling of Transformers with the texture analysis capabilities of Gabor and LoG features. Evaluated on the Synapse multi-organ and ACDC cardiac segmentation datasets, GLoG-CSUnet demonstrates significant improvements over state-of-the-art models, achieving a 1.14% increase in Dice score for Synapse and 0.99% for ACDC, with minimal computational overhead (only 15 and 30 additional parameters, respectively). GLoG-CSUnet's flexible design allows integration with various base models, offering a promising approach for incorporating radiomics-inspired feature extraction in Transformer architectures for medical image analysis. The code implementation is available on GitHub at: https://github.com/HAAIL/GLoG-CSUnet.
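
The LoG half of the filter bank can be parameterized even more frugally, with a single trainable scale per filter. The sketch below is again an illustrative PyTorch approximation, not the released code: it regenerates a zero-mean Laplacian-of-Gaussian kernel from a learnable sigma on each forward pass, yielding multi-scale edge and blob responses.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableLoG(nn.Module):
    """Laplacian-of-Gaussian bank with one learnable scale (sigma) per
    filter. Kernels are recomputed from sigma each forward pass, so the
    whole bank costs only num_scales trainable parameters."""

    def __init__(self, num_scales=3, kernel_size=7):
        super().__init__()
        self.kernel_size = kernel_size
        self.sigma = nn.Parameter(torch.linspace(1.0, 2.0, num_scales))

    def _kernels(self):
        k = self.kernel_size // 2
        coords = torch.arange(-k, k + 1, dtype=torch.float32, device=self.sigma.device)
        ys, xs = torch.meshgrid(coords, coords, indexing="ij")
        r2 = (xs**2 + ys**2)[None]           # (1, K, K)
        s2 = (self.sigma**2)[:, None, None]  # (S, 1, 1)
        # LoG(x, y) = -1/(pi*sigma^4) * (1 - r^2/(2 sigma^2)) * exp(-r^2/(2 sigma^2))
        log = -(1.0 / (math.pi * s2**2)) * (1 - r2 / (2 * s2)) * torch.exp(-r2 / (2 * s2))
        # Zero-mean so flat regions produce no response.
        return (log - log.mean(dim=(1, 2), keepdim=True))[:, None]  # (S, 1, K, K)

    def forward(self, x):  # x: (B, 1, H, W)
        return F.conv2d(x, self._kernels(), padding=self.kernel_size // 2)
```
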
Problem

Research questions and friction points this paper is trying to address:

- ViTs capture long-range correlations but struggle to model local spatial information
- Fine image detail is lost without texture, edge, and boundary cues
- Anatomical structures are hard to segment accurately on small datasets without extensive pre-training
Innovation

Methods, ideas, or system contributions that make the work stand out (see the composition sketch after this list):

- GLoG-CSUnet: learnable Gabor and LoG convolutional layers feeding a Transformer-based segmentation model
- Swin Transformer backbone combined with radiomics-inspired feature extraction at negligible parameter cost
- Flexible design intended to integrate with various base models for medical image analysis
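
One plausible way these pieces compose is sketched below, under the assumption (not confirmed by the abstract) that the raw image and the Gabor/LoG responses are concatenated channel-wise before the Swin backbone; GLoGFrontEnd and the channel arithmetic are hypothetical.

```python
import torch
import torch.nn as nn

class GLoGFrontEnd(nn.Module):
    """Hypothetical wiring: concatenate the raw image with the Gabor and
    LoG responses channel-wise, then hand the stack to a segmentation
    backbone (e.g. a Swin-Unet expecting 1 + num_gabor + num_log input
    channels). Reuses the LearnableGabor/LearnableLoG sketches above."""

    def __init__(self, backbone, num_gabor=4, num_log=3):
        super().__init__()
        self.gabor = LearnableGabor(num_filters=num_gabor)
        self.log = LearnableLoG(num_scales=num_log)
        self.backbone = backbone

    def forward(self, x):  # x: (B, 1, H, W)
        feats = torch.cat([x, self.gabor(x), self.log(x)], dim=1)
        return self.backbone(feats)
```
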
👥 Authors

Niloufar Eghbali, Michigan State University (machine learning, reinforcement learning, augmented intelligence, soft computing)
Hassan Bagher-Ebadian, Department of Radiation Oncology, Henry Ford Health, Detroit, USA
Tuka Alhanai, New York University Abu Dhabi (machine learning, computer science, signal processing)
Mohammad M. Ghassemi, Computer Science Department, Michigan State University, East Lansing, USA