MetaFormer-driven Encoding Network for Robust Medical Semantic Segmentation

📅 2026-01-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes MFEnNet, an efficient medical image segmentation framework aimed at resource-constrained clinical settings, where high-performance models are often impractical because of their computational demands. MFEnNet integrates a MetaFormer architecture into the U-Net encoder and replaces conventional self-attention with pooling-based Transformer blocks, which model global context at reduced computational cost. Swish activation is used for smoother gradients and faster convergence, and spatial pyramid pooling at the bottleneck improves multi-scale feature extraction. Experiments on multiple medical segmentation benchmarks show accuracy competitive with state-of-the-art methods at significantly lower computational cost.
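To make the pooling-for-attention swap concrete, here is a minimal pure-Python sketch of a pooling token mixer in the style popularized by PoolFormer (average pooling minus the input, with the residual added back by the block). It is an illustration under assumptions, not MFEnNet's actual code: the function names, the 3x3 window, and the single-channel 2D input are hypothetical, and normalization and the channel MLP of a full MetaFormer block are omitted.

```python
def avg_pool_same(x, k=3):
    """Stride-1 average pooling over a 2D map that keeps the input size.

    Border cells average only over in-bounds neighbours.
    """
    h, w = len(x), len(x[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [x[a][b]
                    for a in range(max(0, i - r), min(h, i + r + 1))
                    for b in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out


def pooling_token_mixer(x, k=3):
    """Token mixer: pooled map minus the input (assumed PoolFormer-style)."""
    pooled = avg_pool_same(x, k)
    return [[p - v for p, v in zip(prow, xrow)]
            for prow, xrow in zip(pooled, x)]


def pooling_block(x, k=3):
    """Residual connection around the mixer: x + mixer(x)."""
    mixed = pooling_token_mixer(x, k)
    return [[v + m for v, m in zip(xrow, mrow)]
            for xrow, mrow in zip(x, mixed)]
```

Each output cell touches only a fixed-size window, so the cost grows linearly with the number of tokens, whereas self-attention compares every token with every other token and grows quadratically.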

📝 Abstract

Semantic segmentation is crucial for medical image analysis, enabling precise disease diagnosis and treatment planning. However, many advanced models employ complex architectures, limiting their use in resource-constrained clinical settings. This paper proposes MFEnNet, an efficient medical image segmentation framework that incorporates MetaFormer in the encoding phase of the U-Net backbone. MetaFormer, an architectural abstraction of vision transformers, provides a versatile alternative to convolutional neural networks by transforming tokenized image patches into sequences for global context modeling. To mitigate the substantial computational cost associated with self-attention, the proposed framework replaces conventional transformer modules with pooling transformer blocks, thereby achieving effective global feature aggregation at reduced complexity. In addition, Swish activation is used to achieve smoother gradients and faster convergence, while spatial pyramid pooling is incorporated at the bottleneck to improve multi-scale feature extraction. Comprehensive experiments on different medical segmentation benchmarks demonstrate that the proposed MFEnNet approach attains competitive accuracy while significantly lowering computational cost compared to state-of-the-art models. The source code for this work is available at https://github.com/tranleanh/mfennet.
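The two auxiliary components named in the abstract can be sketched in a few lines of pure Python. This is a hedged illustration only: the function names, the `beta` parameter, the pyramid levels `(1, 2, 4)`, and the use of max pooling into fixed bins (the classic spatial pyramid pooling formulation) are assumptions, and the variant used at the MFEnNet bottleneck may differ.

```python
import math


def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x).

    Naive form; very negative inputs can overflow math.exp.
    """
    return x / (1.0 + math.exp(-beta * x))


def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a 2D map into an n x n grid for each level and concatenate.

    Returns a flat list whose length is sum(n * n for n in levels),
    independent of the input resolution.
    """
    h, w = len(fmap), len(fmap[0])
    feats = []
    for n in levels:
        for bi in range(n):
            # Row range of this bin (floor start, ceil end keeps bins non-empty).
            r0, r1 = (bi * h) // n, math.ceil((bi + 1) * h / n)
            for bj in range(n):
                c0, c1 = (bj * w) // n, math.ceil((bj + 1) * w / n)
                feats.append(max(fmap[r][c]
                                 for r in range(r0, r1)
                                 for c in range(c0, c1)))
    return feats
```

Unlike ReLU, Swish is smooth and non-zero for small negative inputs, which is the property the abstract associates with smoother gradients. Pooling the same map at several grid sizes summarizes it at coarse and fine scales at once, which is the sense in which it supports multi-scale feature extraction.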

Problem

Research questions and friction points this paper is trying to address.

medical semantic segmentation

computational efficiency

resource-constrained settings

model complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

MetaFormer

Pooling Transformer

Medical Semantic Segmentation

Efficient Architecture