MaskMed: Decoupled Mask and Class Prediction for Medical Image Segmentation

📅 2025-11-19
🤖 AI Summary
In medical image segmentation, conventional point-wise convolutional heads bind each output channel to a specific class, which limits feature sharing and semantic generalization. MaskMed addresses this with a unified decoupled segmentation head that splits multi-class segmentation into class-agnostic mask prediction and class label prediction, with shared object queries enabling feature reuse across classes. It also introduces a Full-Scale Aware Deformable Transformer in which low-resolution encoder features attend to full-resolution encoder features through deformable attention, giving memory-efficient, spatially aligned full-scale fusion. MaskMed reports state-of-the-art performance, surpassing nnUNet by +2.0% Dice on AMOS 2022 and +6.9% Dice on BTCV. The main contributions are (1) the decoupled head design, (2) shared object queries for cross-class feature reuse, and (3) deformable attention for full-scale feature fusion.

📝 Abstract
Medical image segmentation typically adopts a point-wise convolutional segmentation head to predict dense labels, where each output channel is heuristically tied to a specific class. This rigid design limits both feature sharing and semantic generalization. In this work, we propose a unified decoupled segmentation head that separates multi-class prediction into class-agnostic mask prediction and class label prediction using shared object queries. Furthermore, we introduce a Full-Scale Aware Deformable Transformer module that enables low-resolution encoder features to attend across full-resolution encoder features via deformable attention, achieving memory-efficient and spatially aligned full-scale fusion. Our proposed method, named MaskMed, achieves state-of-the-art performance, surpassing nnUNet by +2.0% Dice on AMOS 2022 and +6.9% Dice on BTCV.
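The decoupled head described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the linear projections `w_class` and `w_mask`, the extra "no object" class, and the way masks and class scores are merged into a label map (a MaskFormer-style weighted sum) are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def decoupled_head(queries, voxel_features, w_class, w_mask):
    """Predict one class distribution and one class-agnostic mask per query.

    queries:        (Q, C) shared object query embeddings
    voxel_features: (C, D, H, W) per-voxel features
    w_class:        (C, K + 1) hypothetical classifier; last column = "no object"
    w_mask:         (C, C) hypothetical projection to mask embeddings
    """
    # Class branch: each query is classified independently of any mask.
    class_probs = softmax(queries @ w_class, axis=-1)            # (Q, K + 1)
    # Mask branch: dot product between mask embeddings and voxel features.
    mask_embed = queries @ w_mask                                 # (Q, C)
    mask_logits = np.einsum("qc,cdhw->qdhw", mask_embed, voxel_features)
    return class_probs, sigmoid(mask_logits)                      # (Q, D, H, W)


def semantic_map(class_probs, mask_probs):
    """Merge per-query outputs into a dense label map (assumed merge rule)."""
    # Drop the "no object" column, then weight every mask by its class scores.
    scores = np.einsum("qk,qdhw->kdhw", class_probs[:, :-1], mask_probs)
    return scores.argmax(axis=0)                                  # (D, H, W)
```

Because the number of queries Q is independent of the number of classes K, no output channel is tied to a fixed class: any query can produce a mask for any class, which is the sense in which the mask and class predictions are decoupled.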

Problem

Research questions and friction points this paper is trying to address.

Point-wise convolutional heads tie each output channel to a fixed class
This rigid coupling limits feature sharing and semantic generalization
Full-resolution encoder features need to be fused in a memory-efficient, spatially aligned way

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled mask and class prediction using shared queries
Full-scale deformable attention for feature fusion
Memory-efficient transformer module for medical segmentation
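The deformable-attention idea behind the full-scale fusion can be sketched as follows. This is a simplified 2D, single-head, single-query illustration, not the paper's module: the projections `w_offset` and `w_attn`, the pixel-unit offsets, and the omission of value and output projections are assumptions. The point it shows is that a low-resolution query reads only P sampled locations of the full-resolution map rather than all H × W positions.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def bilinear_sample(feat, y, x):
    """Sample feat (C, H, W) at float pixel coords (y, x); zero outside."""
    C, H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    out = np.zeros(C)
    for yy, xx in ((y0, x0), (y0, x0 + 1), (y0 + 1, x0), (y0 + 1, x0 + 1)):
        if 0 <= yy < H and 0 <= xx < W:
            # Bilinear weight falls off linearly with distance to the corner.
            w = (1.0 - abs(y - yy)) * (1.0 - abs(x - xx))
            out += w * feat[:, yy, xx]
    return out


def deformable_attend(query, ref_point, high_res, w_offset, w_attn):
    """One low-res query attends to P sampled points of a high-res map.

    query:     (C,) low-resolution feature used as the query
    ref_point: (2,) normalized (y, x) reference location in [0, 1]
    high_res:  (C, H, W) full-resolution feature map
    w_offset:  (C, 2 * P) hypothetical projection to P sampling offsets (pixels)
    w_attn:    (C, P) hypothetical projection to P attention logits
    """
    C, H, W = high_res.shape
    P = w_attn.shape[1]
    offsets = (query @ w_offset).reshape(P, 2)   # where to look
    weights = softmax(query @ w_attn)            # how much each point counts
    base = ref_point * np.array([H - 1, W - 1])  # reference point in pixels
    out = np.zeros(C)
    for p in range(P):
        y, x = base + offsets[p]
        out += weights[p] * bilinear_sample(high_res, y, x)
    return out
```

In this sketch the work per query scales with the number of sampling points P instead of the size of the full-resolution map, which is one plausible reading of why this style of attention is memory-efficient.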
Bin Xie
InfoBeyond Technology LLC
Mobile Computing, Security, Big Data Streaming
G. Agam
Department of Computer Science, Illinois Institute of Technology, USA