🤖 AI Summary
The Segment Anything Model (SAM) generalizes poorly and is badly calibrated on brain MRI segmentation because of domain shift and overconfidence. To address this, the paper proposes CalSAM, a lightweight adaptation framework that freezes SAM's encoder and fine-tunes only the mask decoder. It introduces two penalties: (i) a feature-level Fisher information penalty that reduces sensitivity to domain shift, and (ii) a voxel-wise confidence misalignment penalty that penalizes overconfident errors to calibrate prediction reliability. Together, the two penalties improve robustness and calibration across multi-center and multi-scanner settings while preserving the computational benefit of a frozen encoder. On the BraTS scanner-transfer task (Siemens to GE), CalSAM achieves a Dice Similarity Coefficient (DSC) of 80.1% versus 74.6% for the baseline (a 7.4% relative gain) and reduces the 95th-percentile Hausdorff distance (HD95) by 26.9%. On ATLAS-C motion-corrupted data, it attains a DSC of 75.9% and reduces Expected Calibration Error (ECE) by 32.6%. These results support CalSAM's effectiveness and cross-domain generalizability.
📝 Abstract
The Segment Anything Model (SAM) exhibits strong zero-shot performance on natural images but suffers from domain shift and overconfidence when applied to medical volumes. We propose \textbf{CalSAM}, a lightweight adaptation framework that (i) reduces encoder sensitivity to domain shift via a \emph{Feature Fisher Information Penalty} (FIP) computed on 3D feature maps and (ii) penalizes overconfident voxel-wise errors through a \emph{Confidence Misalignment Penalty} (CMP). The combined loss, $\mathcal{L}_{\mathrm{CalSAM}}$, is used to fine-tune only the mask decoder while keeping SAM's encoders frozen. On cross-center and scanner-shift evaluations, CalSAM substantially improves accuracy and calibration: e.g., on the BraTS scanner split (Siemens$\to$GE), CalSAM shows a $+7.4\%$ relative improvement in $\mathrm{DSC}$ (80.1% vs. 74.6%), a $-26.9\%$ reduction in $\mathrm{HD95}$ (4.6 mm vs. 6.3 mm), and a $-39.5\%$ reduction in $\mathrm{ECE}$ (5.2% vs. 8.6%). On ATLAS-C (motion corruptions), CalSAM achieves a $+5.3\%$ relative improvement in $\mathrm{DSC}$ (75.9%) and a $-32.6\%$ reduction in $\mathrm{ECE}$ (5.8%). Ablations show that FIP and CMP contribute complementary gains ($p<0.01$), and the Fisher penalty incurs a modest $\sim$15% training-time overhead. CalSAM therefore delivers improved domain generalization and better-calibrated uncertainty estimates for brain MRI segmentation, while retaining the computational benefits of freezing SAM's encoder.
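The abstract reports two kinds of numbers: relative changes against a baseline, and Expected Calibration Error. The following is a minimal Python sketch of how such quantities are commonly computed. It is not taken from the paper: the function names `relative_change` and `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions for illustration, and the authors' exact ECE protocol (bin count, voxel selection) is not stated in the abstract.

```python
def relative_change(new, baseline):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE (assumed protocol, not confirmed by the paper).

    confidences: predicted probability of the chosen class per voxel, in [0, 1]
    correct:     booleans, True where the prediction matches the label
    Returns the bin-weighted mean of |accuracy - mean confidence|.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each voxel to an equal-width confidence bin; 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of voxels.
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

For instance, `relative_change(80.1, 74.6)` comes to roughly 7.4, which matches the reported relative DSC gain, whereas the absolute gain is 5.5 percentage points. A model that predicts with confidence 0.9 but is right only three times out of four contributes a gap of 0.15 in its bin, which is the kind of overconfidence the calibration metric is meant to expose.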