🤖 AI Summary
In medical image segmentation, Transformer-based skip connections struggle to capture fine local detail and carry high computational overhead, while uninformative, high-entropy feature maps degrade the quality of spatial attention. To address these issues, the authors propose TransGUNet, a framework that couples an attentional cross-scale graph neural network (ACS-GNN) with entropy-driven feature selection (EFS). ACS-GNN converts cross-scale feature maps into a graph inside the skip connection and applies node attention, so that fine-grained anatomical structures and global dependencies are modeled together; EFS computes an entropy score for each channel and filters out high-entropy feature maps before spatial attention is applied. Evaluated on six seen and eight unseen datasets, TransGUNet achieves superior segmentation performance and significantly higher efficiency than previous methods, with improved domain generalization across modalities and more robust skip-connection features.
📝 Abstract
Skip connection engineering is primarily employed to address the semantic gap between the encoder and decoder, while also integrating global dependencies to understand the relationships among complex anatomical structures in medical image segmentation. Although several models have proposed transformer-based approaches to incorporate global dependencies within skip connections, they often struggle to capture detailed local features and incur high computational complexity. In contrast, graph neural networks (GNNs) exploit graph structures to effectively capture both local and global features. Leveraging these properties, we introduce an attentional cross-scale graph neural network (ACS-GNN), which enhances the skip connection framework by converting cross-scale feature maps into a graph structure and capturing complex anatomical structures through node attention. Additionally, we observed that deep learning models often produce uninformative feature maps, which degrade the quality of spatial attention maps. To address this problem, we integrated entropy-driven feature selection (EFS) with spatial attention, calculating an entropy score for each channel and filtering out high-entropy feature maps. Our framework, TransGUNet, comprises ACS-GNN and EFS-based spatial attention to effectively enhance domain generalizability across various modalities by leveraging GNNs alongside a reliable spatial attention map, ensuring more robust features within the skip connection. Through comprehensive experiments and analysis, TransGUNet achieved superior segmentation performance on six seen and eight unseen datasets, demonstrating significantly higher efficiency compared to previous methods.
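The EFS step described above (score each channel by entropy, then drop the high-entropy ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the entropy is computed or how many channels are kept, so the softmax over spatial positions, the `keep` parameter, and the function names `channel_entropy` and `select_low_entropy_channels` are all assumptions made for illustration.

```python
import math


def channel_entropy(channel):
    """Shannon entropy of one channel after a softmax over its spatial positions.

    Assumption: the softmax normalization is an illustrative choice,
    not something stated in the abstract.
    """
    values = [v for row in channel for v in row]
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_low_entropy_channels(features, keep):
    """Return the indices of the `keep` lowest-entropy channels, in channel order.

    Assumption: a fixed top-k cut-off stands in for whatever selection
    rule the paper actually uses.
    """
    scores = [channel_entropy(ch) for ch in features]
    ranked = sorted(range(len(features)), key=lambda i: scores[i])
    return sorted(ranked[:keep])


# Toy input: three 2x2 channels.
features = [
    [[0.0, 0.0], [0.0, 0.0]],  # flat response: maximal entropy, log(4)
    [[9.0, 0.0], [0.0, 0.0]],  # one sharp peak: low entropy
    [[2.0, 1.0], [0.0, 0.0]],  # in between
]
kept = select_low_entropy_channels(features, keep=2)
```

In this toy case the flat channel is the one filtered out, which matches the intuition in the abstract that a uniform, high-entropy map carries little spatial information for an attention map to exploit.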