AI Summary
To address the heavy reliance on expert annotations, limited generalizability, and insufficient robustness of 3D MRI segmentation of cartilage and meniscus for knee osteoarthritis monitoring, this paper proposes a memory-augmented vision foundation model (VFM), presented as the first to integrate memory mechanisms with spatial awareness. We design a Hybrid Shuffling Strategy (HSS) to improve training convergence and spatial consistency, and introduce prompt-driven mask propagation that needs minimal human interaction (as few as three clicks per volume) to achieve anatomy-level segmentation. Evaluated on an external test set of 57 cases, our method achieves a mean Dice score improvement of 5.0 points (up to +12.1 for tibial cartilage), reduces cartilage thickness measurement error by up to a factor of three, lowers annotation burden, and enhances cross-center generalizability.
Abstract
Accurate morphometric assessment of cartilage (such as thickness and volume) via MRI is essential for monitoring knee osteoarthritis. Segmenting cartilage remains challenging and depends on extensive expert-annotated datasets, which are heavily subject to inter-reader variability. Recent advances in Visual Foundational Models (VFMs), especially memory-based approaches, offer opportunities to improve generalizability and robustness. This study introduces a deep learning (DL) method for cartilage and meniscus segmentation from 3D MRIs using interactive, memory-based VFMs. To improve spatial awareness and convergence, we incorporated a Hybrid Shuffling Strategy (HSS) during training and applied a segmentation mask propagation technique to enhance annotation efficiency. We trained four AI models on 3D knee MRIs from 270 patients drawn from public and internal datasets: a CNN-based 3D-VNet, two automatic transformer-based models (SaMRI2D and SaMRI3D), and a transformer-based, promptable, memory-based VFM (SAMRI-2). The models were evaluated on 57 external cases, which included multi-radiologist annotations and different data acquisitions. Model performance was assessed against reference standards using the Dice Score (DSC) and Intersection over Union (IoU), with additional morphometric evaluations to further quantify segmentation accuracy. The SAMRI-2 model, trained with HSS, outperformed all other models, achieving an average DSC improvement of 5 points, with a peak improvement of 12 points for tibial cartilage. It also showed the lowest cartilage thickness errors, reducing discrepancies by up to threefold. Notably, SAMRI-2 maintained high performance with as few as three user clicks per volume, reducing annotation effort while preserving anatomical precision. This memory-based VFM with spatial awareness offers a novel approach to reliable AI-assisted knee MRI segmentation, advancing DL in musculoskeletal imaging.
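For readers unfamiliar with the two overlap metrics named above, the following is a minimal sketch of how DSC and IoU are commonly computed for a pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_iou`, the flattened-list input format, and the convention of scoring two empty masks as perfect agreement are all assumptions made for illustration. The values are returned as fractions in [0, 1]; the "points" reported in the abstract presumably refer to these scores on a 0 to 100 scale.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], ref: Sequence[int]) -> Tuple[float, float]:
    """Return (DSC, IoU) for two flattened binary masks of equal length.

    Hypothetical helper for illustration; nonzero entries count as foreground.
    """
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    pred_size = sum(1 for p in pred if p)
    ref_size = sum(1 for r in ref if r)
    union = pred_size + ref_size - intersection

    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    dsc = 2.0 * intersection / (pred_size + ref_size)
    iou = intersection / union
    return dsc, iou


# Toy example: an 8-voxel volume, flattened to a list.
pred_mask = [1, 1, 1, 0, 0, 0, 1, 0]
ref_mask = [1, 1, 0, 0, 0, 1, 1, 0]
dsc, iou = dice_and_iou(pred_mask, ref_mask)
print(f"DSC = {dsc:.2f}, IoU = {iou:.2f}")  # DSC = 0.75, IoU = 0.60
```

The two metrics are monotonically related (IoU = DSC / (2 - DSC)), so they rank models identically on a single case, although averages over cases can differ.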