🤖 AI Summary
Existing 3D medical image segmentation methods that model long-range dependencies suffer from high computational cost, scan-order bias from forward-only scanning, and suboptimal aggregation across scan directions. This work proposes BiSegMamba, which pairs a bidirectional tri-oriented Ortho Mamba (Bi-ToOM) block with adaptive directional fusion (ADF) to mitigate scan-order bias and to weight multi-directional context according to the input. A progressive compacting stem (PCS) and a multi-scale spatial mixer (MSSM) let the network capture long-range dependencies across multiple orthogonal views in a compact latent space while preserving high-resolution detail. Experiments on carotid CTA, BraTS2023, ACDC, and AMOS-CT show that BiSegMamba matches or exceeds the segmentation accuracy of SegMamba-V2 while reducing FLOPs by up to 77.9%.
📝 Abstract
Accurate 3D medical image segmentation requires both long-range volumetric context and fine boundary preservation. CNN-based methods have limited global dependency modeling, while Transformer-based models are often computationally expensive for dense 3D inputs. Recent Mamba-based methods provide an efficient alternative, but existing volumetric designs still depend on repeated high-resolution scanning, forward-only sequential modeling, and fixed directional summation, which lead to high cost, scan-order bias, and suboptimal directional aggregation, respectively. We propose BiSegMamba, an efficient bidirectional tri-oriented Mamba network for 3D medical image segmentation. BiSegMamba follows a compact-to-detail design, where a progressive compacting stem (PCS) enables efficient latent-space reasoning while retaining shallow high-resolution features for reconstruction. A multi-scale spatial mixer (MSSM) captures local anatomical patterns in early stages, and the proposed bidirectional tri-oriented Ortho Mamba (Bi-ToOM) block models long-range dependencies from multiple orthogonal views using jointly processed forward and backward scan sequences. Adaptive directional fusion (ADF) learns input-dependent channel-wise weights across scan orientations, replacing fixed summation with orientation-aware fusion. Experiments on a collected carotid CTA dataset and three public benchmarks, BraTS2023, ACDC, and AMOS-CT, show that BiSegMamba generalizes well across vascular, cardiac, brain tumor, and abdominal multi-organ segmentation tasks. Compared with SegMamba-V2, BiSegMamba achieves slightly better performance on BraTS2023 and clear improvements on ACDC and the carotid dataset, while reducing FLOPs by up to 77.9%, demonstrating a strong accuracy-efficiency balance for general 3D medical image segmentation.
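To make the Bi-ToOM and ADF ideas from the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and several pieces are assumptions: the three axis orders in `AXIS_ORDERS`, the running-mean `causal_mix` that stands in for a real Mamba selective scan, the averaging of separately computed forward and backward passes (the abstract describes them as jointly processed), and the gate parameters `w` and `b`, which would be learned in practice. The sketch only shows the data flow: flatten the volume along three orientations, scan each sequence in both directions, then fuse the three results with input-dependent channel-wise softmax weights instead of a fixed sum.

```python
import numpy as np

# Hypothetical scan orders over the (D, H, W) axes of a (C, D, H, W) volume.
AXIS_ORDERS = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]

def causal_mix(seq):
    """Stand-in for a selective scan: causal running mean over an (L, C) sequence."""
    steps = np.arange(1, seq.shape[0] + 1)[:, None]
    return np.cumsum(seq, axis=0) / steps

def bidirectional_scan(seq):
    """Average a forward and a backward pass so each position sees both sides."""
    fwd = causal_mix(seq)
    bwd = causal_mix(seq[::-1])[::-1]  # scan the reversed sequence, then flip back
    return 0.5 * (fwd + bwd)

def scan_orientation(x, order):
    """Flatten x (C, D, H, W) along one axis order, scan, and restore the layout."""
    perm = (*order, 0)                       # spatial axes first, channels last
    moved = x.transpose(perm)
    seq = moved.reshape(-1, x.shape[0])      # (L, C) token sequence
    out = bidirectional_scan(seq).reshape(moved.shape)
    inverse = tuple(np.argsort(perm).tolist())
    return out.transpose(inverse)            # back to (C, D, H, W)

def adaptive_directional_fusion(feats, w, b):
    """Fuse K orientation features (K, C, D, H, W) with channel-wise softmax weights.

    w (C, C) and b (C,) are hypothetical gate parameters.
    """
    desc = feats.mean(axis=(2, 3, 4))                    # (K, C) global descriptors
    logits = desc @ w + b                                # (K, C) input-dependent scores
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)  # softmax over orientations
    fused = (weights[:, :, None, None, None] * feats).sum(axis=0)
    return fused, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 5, 3))                    # toy volume: C=4, D=6, H=5, W=3
feats = np.stack([scan_orientation(x, order) for order in AXIS_ORDERS])
w = 0.1 * rng.standard_normal((4, 4))
fused, weights = adaptive_directional_fusion(feats, w, np.zeros(4))
print(fused.shape, weights.shape)
```

With the gate set to zero, every orientation receives weight 1/3 per channel and the fusion reduces to a scaled fixed summation, which is the baseline behavior that ADF is described as replacing.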