🤖 AI Summary
To address the high computational cost and insufficient local detail modeling of Transformers in multi-organ medical image segmentation, this paper proposes LamFormer—a novel U-shaped architecture. Methodologically, it integrates three key innovations: (1) a linear-attention Mamba module to drastically reduce computational complexity for long-range dependency modeling; (2) an enhanced pyramid encoder coupled with a Parallel Hierarchical Feature Aggregation (PHFA) module to bridge semantic gaps across multi-scale features; and (3) a lightweight Reduced Transformer to jointly enhance local detail representation and global contextual awareness. Evaluated on seven mainstream medical imaging benchmarks, LamFormer consistently outperforms existing state-of-the-art methods—achieving superior segmentation accuracy while requiring significantly fewer parameters and lower FLOPs. It thus establishes a new Pareto-optimal trade-off between precision and efficiency in medical image segmentation.
📝 Abstract
In multi-organ medical image segmentation, recent methods frequently employ Transformers to capture long-range dependencies from image features. However, these methods overlook the high computational cost of Transformers and their deficiencies in extracting local detail. To address both issues, we reassess the design of feature extraction modules and propose a new deep-learning network, LamFormer, for fine-grained segmentation across multiple organs. LamFormer is a novel U-shaped network that employs Linear Attention Mamba (LAM) in an enhanced pyramid encoder to capture multi-scale long-range dependencies. We construct the Parallel Hierarchical Feature Aggregation (PHFA) module to aggregate features from different encoder layers, narrowing the semantic gap among features while filtering information. Finally, we design the Reduced Transformer (RT), which uses a distinct computational approach to globally model up-sampled features; the RT enhances the extraction of local detail and improves the network's ability to capture long-range dependencies. LamFormer outperforms existing segmentation methods on seven complex and diverse datasets, demonstrating strong performance while achieving a favorable balance between accuracy and model complexity.
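The abstract does not spell out the internals of the LAM module, but the efficiency argument rests on linear attention replacing the quadratic cost of standard self-attention. A minimal sketch of generic kernel-feature-map linear attention (using an elu+1 feature map; this is an illustrative assumption, not the paper's exact LAM formulation): by computing the d×d key-value product once, the per-token cost becomes independent of sequence length n, so total cost is O(n·d²) rather than O(n²·d).

```python
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    """Illustrative kernel-feature-map linear attention (assumed form).

    q, k: (n, d) queries/keys; v: (n, d_v) values.
    Softmax attention costs O(n^2 * d); here K^T V is computed once
    and reused for every query, giving O(n * d * d_v).
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    q, k = phi(q), phi(k)
    kv = k.T @ v                   # (d, d_v): shared across all queries
    z = q @ k.sum(axis=0)          # (n,): per-query normalizer
    return (q @ kv) / (z[:, None] + eps)

# toy sequence: 512 tokens, 16-dim heads
rng = np.random.default_rng(0)
q = rng.standard_normal((512, 16))
k = rng.standard_normal((512, 16))
v = rng.standard_normal((512, 16))
out = linear_attention(q, k, v)
print(out.shape)  # (512, 16)
```

Because the feature map is strictly positive, each output row is a convex combination of the value rows, mirroring softmax attention's averaging behavior at linear cost.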