🤖 AI Summary
To address label scarcity and the modeling of temporal dynamics in longitudinal MRI segmentation for prostate cancer active surveillance, this paper proposes a 3D semi-supervised segmentation framework tailored to sparsely annotated longitudinal data. Methodologically, it introduces a Mamba-enhanced cross-attention module to capture long-range inter-temporal dependencies, incorporates a shape-aware decoder to explicitly model prostate morphological evolution, and uses nnU-Net–generated high-quality pseudo-labels for self-training. The key contributions are: (i) the first integration of the state-space model Mamba into longitudinal medical image segmentation, enabling efficient joint spatiotemporal modeling; and (ii) superior performance over U-Net and Transformer baselines under sparse and noisy labeling conditions, with a Dice score improvement of over 3.2% on a public longitudinal prostate MRI dataset, demonstrating robustness and clinical applicability.
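For reference, the Dice score quoted above measures the overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). The following is a minimal sketch in plain Python, assuming binary masks flattened to sequences of 0/1; the function name and the empty-mask convention are illustrative choices, not taken from the paper.

```python
def dice_score(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```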
📝 Abstract
Active Surveillance (AS) is a treatment option for managing low- and intermediate-risk prostate cancer (PCa), aiming to avoid overtreatment while monitoring disease progression through serial MRI and clinical follow-up. Accurate prostate segmentation is an important preliminary step for automating this process, enabling automated detection and diagnosis of PCa. However, existing deep-learning segmentation models are often trained on single-time-point, expertly annotated datasets, making them unsuitable for longitudinal AS analysis, where multiple time points and a scarcity of expert labels hinder effective fine-tuning. To address these challenges, we propose MambaX-Net, a novel semi-supervised, dual-scan 3D segmentation architecture that computes the segmentation for time point t by leveraging the MRI and the corresponding segmentation mask from the previous time point. We introduce two new components: (i) a Mamba-enhanced Cross-Attention Module, which integrates the Mamba block into cross-attention to efficiently capture temporal evolution and long-range spatial dependencies, and (ii) a Shape Extractor Module that encodes the previous segmentation mask into a latent anatomical representation for refined zone delineation. Moreover, we introduce a semi-supervised self-training strategy that leverages pseudo-labels generated from a pre-trained nnU-Net, enabling effective learning without expert annotations. MambaX-Net was evaluated on a longitudinal AS dataset, and results showed that it significantly outperforms state-of-the-art U-Net and Transformer-based models, achieving superior prostate zone segmentation even when trained on limited and noisy data.
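The dual-scan setup described above can be illustrated with a small data-preparation sketch: each training sample pairs the scan at time point t with the scan and mask from the previous time point, and missing expert masks are filled with pseudo-labels. This is only a sketch under stated assumptions, not the authors' pipeline. The names `Scan`, `DualScanSample`, `build_dual_scan_samples`, and `pseudo_labeler` are hypothetical, volumes are represented by placeholder strings, and the rule of preferring an expert mask when one exists is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Scan:
    patient_id: str
    time_point: int
    mri: str                           # placeholder for a 3D volume (e.g. a file path)
    expert_mask: Optional[str] = None  # often unavailable in the AS setting


@dataclass
class DualScanSample:
    current_mri: str    # MRI at time point t
    previous_mri: str   # MRI at the previous time point
    previous_mask: str  # segmentation mask at the previous time point
    target_mask: str    # training target for time point t


def build_dual_scan_samples(
    scans: List[Scan], pseudo_labeler: Callable[[str], str]
) -> List[DualScanSample]:
    """Pair each scan with its predecessor from the same patient."""
    by_patient: Dict[str, List[Scan]] = {}
    for scan in scans:
        by_patient.setdefault(scan.patient_id, []).append(scan)

    samples: List[DualScanSample] = []
    for series in by_patient.values():
        series.sort(key=lambda s: s.time_point)
        # Assumption: use the expert mask if present, otherwise a pseudo-label
        # (standing in for the output of a pre-trained segmentation model).
        masks = [
            s.expert_mask if s.expert_mask is not None else pseudo_labeler(s.mri)
            for s in series
        ]
        # Consecutive pairs (t-1, t); patients with a single scan yield nothing.
        for prev, curr, prev_mask, curr_mask in zip(series, series[1:], masks, masks[1:]):
            samples.append(DualScanSample(curr.mri, prev.mri, prev_mask, curr_mask))
    return samples
```

In this sketch a patient with n scans contributes n-1 samples, so the first time point only ever serves as context for the second.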