🤖 AI Summary
To address three key challenges of Bayesian fine-tuning for the Segment Anything Model (SAM) in medical ultrasound image segmentation (instability, high computational overhead, and poor interpretability), this paper proposes E-BayesSAM. Methodologically, it reinterprets SAM’s output tokens as dynamic probabilistic weights and introduces token-wise variational Bayesian inference (T-VBI), a training-free uncertainty quantification mechanism. It further couples T-VBI with a self-optimizing Kolmogorov–Arnold network (SO-KAN), which provides token-level interpretable prediction and guides the pruning of redundant tokens. Evaluated on five ultrasound datasets, the pruned E-BayesSAM achieves a mean Dice coefficient of 89.0% (88.0% without pruning), with an inference latency of 0.03 seconds per image. It also identifies four dominant tokens governing SAM’s segmentation decisions, thereby combining high accuracy, low latency, and interpretability in Bayesian medical image segmentation.
📝 Abstract
Although the Segment Anything Model (SAM) has advanced medical image segmentation, its Bayesian adaptation for uncertainty-aware segmentation remains hindered by three key issues: (1) instability in Bayesian fine-tuning of large pre-trained SAMs; (2) high computational cost due to SAM's massive parameter count; and (3) limited interpretability due to SAM's black-box design. To overcome these, we propose E-BayesSAM, an efficient framework combining Token-wise Variational Bayesian Inference (T-VBI) for efficient Bayesian adaptation and a Self-Optimizing Kolmogorov-Arnold Network (SO-KAN) for improved interpretability. T-VBI reinterprets SAM's output tokens as dynamic probabilistic weights and reparameterizes them as latent variables without auxiliary training, enabling training-free VBI for uncertainty estimation. SO-KAN improves token prediction with learnable spline activations trained via self-supervised learning, and provides insight for pruning redundant tokens, which improves both efficiency and accuracy. Experiments on five ultrasound datasets show that E-BayesSAM achieves: (i) real-time inference (0.03 s/image), (ii) superior segmentation accuracy (average DSC: 89.0% for pruned E-BayesSAM vs. 88.0% for E-BayesSAM vs. 88.3% for MedSAM), and (iii) identification of four critical tokens governing SAM's decisions. By unifying efficiency, reliability, and interpretability, E-BayesSAM bridges SAM's versatility with clinical needs, advancing deployment in safety-critical medical applications. The source code is available at https://github.com/mp31192/E-BayesSAM.
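As a rough illustration of the token-wise idea described above, the following sketch treats each output token as the mean of a Gaussian latent variable, draws samples via the reparameterization trick, and derives a per-pixel uncertainty map from the sampled masks. It is not the authors' implementation: the function name `tokenwise_mc_segmentation`, the fixed standard deviation `sigma`, the dot-product mask head, and the use of predictive entropy as the uncertainty measure are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def tokenwise_mc_segmentation(tokens, features, sigma=0.1, n_samples=16, seed=0):
    """Monte Carlo mask prediction with tokens treated as Gaussian latents.

    tokens:   (K, C) array, used as the mean of each token's distribution.
    features: (C, H, W) array of per-pixel image features.
    sigma:    hypothetical fixed standard deviation (an assumption here).

    Returns (mean_prob, entropy), each of shape (K, H, W).
    """
    rng = np.random.default_rng(seed)
    sampled_probs = []
    for _ in range(n_samples):
        # Reparameterization: token = mean + sigma * eps, eps ~ N(0, I).
        eps = rng.standard_normal(tokens.shape)
        sampled_tokens = tokens + sigma * eps
        # Assumed mask head: dot product of each token with pixel features.
        logits = np.einsum("kc,chw->khw", sampled_tokens, features)
        sampled_probs.append(sigmoid(logits))

    # Average foreground probability over the Monte Carlo samples.
    mean_prob = np.mean(np.stack(sampled_probs), axis=0)
    # Binary predictive entropy per pixel; clipping keeps log() finite.
    p = np.clip(mean_prob, 1e-8, 1.0 - 1e-8)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return mean_prob, entropy


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    tokens = rng.standard_normal((4, 8))         # 4 tokens, 8 channels
    features = rng.standard_normal((8, 16, 16))  # 16x16 feature map
    mean_prob, entropy = tokenwise_mc_segmentation(tokens, features)
    print(mean_prob.shape, entropy.shape)
```

Because the sampling happens only over the small set of tokens rather than over the network weights, each extra sample costs one cheap token-feature product, which is consistent with the low latency the abstract reports. How the token variances are actually obtained in E-BayesSAM is not specified in this section.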