🤖 AI Summary
Existing 3D medical image segmentation models ignore the temporal dimension, so they cannot model how lesions evolve over time, which limits their use for tumor progression monitoring and treatment response assessment. To address this, we propose OmniMamba4D, a 4D (3D spatial + temporal) modeling framework for longitudinal CT lesion segmentation, built on a spatiotemporal Mamba architecture grounded in state space models (SSMs). The method combines a quad-directional scanning mechanism with a 4D voxel sequence modeling strategy to jointly encode spatial and temporal lesion dynamics, which improves detection of lesions that regress or disappear. Evaluated on 3,252 CT scans from an internal dataset, the model achieves a Dice score of 0.682, comparable to state-of-the-art methods, while remaining computationally efficient. This work presents a new framework for exploiting spatio-temporal information in longitudinal CT lesion segmentation.
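To make the idea of scanning a 4D voxel sequence in four directions more concrete, here is a minimal sketch of one possible reading. The summary does not say which four directions the model uses, so the choice below (spatial-first and temporal-first orderings, each read forward and backward) is an assumption, and the function name `scan_orders` is hypothetical rather than taken from the authors' code.

```python
from itertools import product


def scan_orders(t, d, h, w):
    """Return four orderings of (time, z, y, x) indices for a 4D volume.

    Assumption: the four directions are spatial-first and temporal-first
    traversals, each forward and backward. The source does not specify them.
    """
    # Spatial-first: visit every voxel of time point 0, then time point 1, ...
    spatial_first = list(product(range(t), range(d), range(h), range(w)))
    # Temporal-first: visit every time point of voxel 0, then voxel 1, ...
    temporal_first = [
        (ti, z, y, x)
        for z, y, x, ti in product(range(d), range(h), range(w), range(t))
    ]
    return {
        "spatial_forward": spatial_first,
        "spatial_backward": spatial_first[::-1],
        "temporal_forward": temporal_first,
        "temporal_backward": temporal_first[::-1],
    }


if __name__ == "__main__":
    # Two time points, a 1 x 1 x 2 volume at each time point.
    for name, order in scan_orders(2, 1, 1, 2).items():
        print(name, order)
```

In a sequence model, each ordering would turn the same set of voxels into a different 1D sequence, so that neighbouring positions in the sequence are close in space for some orderings and close in time for others.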
📝 Abstract
Accurate segmentation of longitudinal CT scans is important for monitoring tumor progression and evaluating treatment responses. However, existing 3D segmentation models focus solely on spatial information. To address this gap, we propose OmniMamba4D, a novel segmentation model designed for 4D medical images (3D images over time). OmniMamba4D utilizes a spatio-temporal tetra-orientated Mamba block to effectively capture both spatial and temporal features. Unlike traditional 3D models, which analyze single time points, OmniMamba4D processes 4D CT data, providing comprehensive spatio-temporal information on lesion progression. Evaluated on an internal dataset comprising 3,252 CT scans, OmniMamba4D achieves a competitive Dice score of 0.682, comparable to state-of-the-art (SOTA) models, while maintaining computational efficiency and better detecting disappeared lesions. This work demonstrates a new framework for leveraging spatio-temporal information in longitudinal CT lesion segmentation.
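For readers unfamiliar with the reported metric, the Dice score between a predicted mask P and a reference mask T is 2|P ∩ T| / (|P| + |T|). The sketch below computes it for flattened binary masks in plain Python. The function name `dice_score` is illustrative, and returning 1.0 when both masks are empty is one common convention assumed here; the abstract does not state how the authors score cases where the lesion has disappeared.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total lesion voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty (e.g. a lesion that has disappeared and is
        # correctly predicted as absent). Assumed convention: perfect score.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))  # one shared voxel out of 2 + 1
    print(dice_score([0, 0, 0, 0], [0, 0, 0, 0]))  # both masks empty
```

The empty-mask branch matters for longitudinal data: when a lesion has disappeared at a follow-up scan, the reference mask is empty and any false-positive voxel drives the score to zero.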