🤖 AI Summary
Existing promptable segmentation models such as SAM are deterministic: they return a single mask per object per prompt, so they cannot represent the segmentation uncertainty that arises from inter-expert annotation variability in medical imaging. To address this, we propose Probabilistic SAM (pSAM), which integrates variational inference into the SAM architecture. A latent variable space modulates the prompt embeddings, and prior and posterior networks are trained jointly so that the distribution over segmentations is modeled explicitly and can be sampled efficiently. Given an image and a prompt, pSAM produces diverse yet plausible segmentations with minimal sampling overhead. Evaluated on the LIDC-IDRI lung nodule dataset, pSAM's sampled segmentations align with the disagreement among expert annotators, and it outperforms existing probabilistic baselines on uncertainty-aware metrics. This work is a step toward uncertainty-aware segmentation in medical imaging.
📝 Abstract
Recent advances in promptable segmentation, such as the Segment Anything Model (SAM), have enabled flexible, high-quality mask generation across a wide range of visual domains. However, SAM and similar models remain fundamentally deterministic, producing a single segmentation per object per prompt, and fail to capture the inherent ambiguity present in many real-world tasks. This limitation is particularly troublesome in medical imaging, where multiple plausible segmentations may exist due to annotation uncertainty or inter-expert variability. In this paper, we introduce Probabilistic SAM, a probabilistic extension of SAM that models a distribution over segmentations conditioned on both the input image and prompt. By incorporating a latent variable space and training with a variational objective, our model learns to generate diverse and plausible segmentation masks reflecting the variability in human annotations. The architecture integrates a prior and posterior network into the SAM framework, allowing latent codes to modulate the prompt embeddings during inference. The latent space allows for efficient sampling during inference, enabling uncertainty-aware outputs with minimal overhead. We evaluate Probabilistic SAM on the public LIDC-IDRI lung nodule dataset and demonstrate its ability to produce diverse outputs that align with expert disagreement, outperforming existing probabilistic baselines on uncertainty-aware metrics. Our code is available at: https://github.com/tbwa233/Probabilistic-SAM/.
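The following is a minimal sketch of the ingredients the abstract names: a prior and a posterior over a latent code, a variational objective, and a latent code that modulates the prompt embedding. It is not taken from the authors' repository. The diagonal-Gaussian parameterization, the additive modulation in `modulate_prompt`, the `beta` weight, and all function names are assumptions made for illustration, and plain Python lists stand in for the network outputs.

```python
import math
import random


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians given means and log-variances."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def modulate_prompt(prompt_embedding, z):
    """Hypothetical modulation: add the latent code to the prompt embedding.

    The abstract only says that latent codes modulate the prompt embeddings;
    element-wise addition is an assumption made for this sketch.
    """
    return [p + zi for p, zi in zip(prompt_embedding, z)]


def negative_elbo(reconstruction_loss, kl, beta=1.0):
    """Variational training loss: reconstruction term plus weighted KL term."""
    return reconstruction_loss + beta * kl


# Stand-in values for what the prior and posterior networks would output.
mu_prior, logvar_prior = [0.0, 0.0], [0.0, 0.0]
mu_post, logvar_post = [0.5, -0.5], [-1.0, -1.0]

# Training: the KL term pulls the posterior toward the prior.
kl = kl_diag_gaussians(mu_post, logvar_post, mu_prior, logvar_prior)
loss = negative_elbo(reconstruction_loss=0.3, kl=kl, beta=1.0)

# Inference: each draw from the prior yields a different modulated prompt,
# and therefore a different candidate mask once passed to the mask decoder.
rng = random.Random(0)
prompt = [1.0, 2.0]
modulated = [modulate_prompt(prompt, sample_latent(mu_prior, logvar_prior, rng))
             for _ in range(3)]
```

In this sketch the posterior is used only during training, where the ground-truth annotation would be available, while inference samples from the prior alone. That is why drawing several masks costs little beyond repeated passes through the lightweight prompt and mask-decoding path.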