Adapting a Segmentation Foundation Model for Medical Image Classification

📅 2025-05-09
🤖 AI Summary
Problem: Medical image classifiers often struggle to discriminate critical anatomical regions because of domain-specific variation and limited labeled data. Method: This work studies whether a segmentation foundation model, the Segment Anything Model (SAM), transfers to medical image classification. Rather than fine-tuning SAM's large image encoder, the authors freeze it and use it as a generic feature extractor. They then introduce a Spatially Localized Channel Attention (SLCA) mechanism that computes spatially localized attention weights from the SAM features and uses them to recalibrate the feature maps of a classification network. Contribution/Results: The framework adapts SAM to medical image classification, a less explored use of the model, without tuning SAM's parameters, which keeps training cost low while reusing SAM's general-purpose features. Experiments on three public medical image classification benchmarks report accuracy improvements and strong data efficiency in few-shot settings, supporting the claim that segmentation priors can help classification.

📝 Abstract
Recent advancements in foundation models, such as the Segment Anything Model (SAM), have shown strong performance in various vision tasks, particularly image segmentation, due to their impressive zero-shot segmentation capabilities. However, effectively adapting such models for medical image classification is still a less explored topic. In this paper, we introduce a new framework to adapt SAM for medical image classification. First, we utilize the SAM image encoder as a feature extractor to capture segmentation-based features that convey important spatial and contextual details of the image, while freezing its weights to avoid unnecessary overhead during training. Next, we propose a novel Spatially Localized Channel Attention (SLCA) mechanism to compute spatially localized attention weights for the feature maps. The features extracted from SAM's image encoder are processed through SLCA to compute attention weights, which are then integrated into deep learning classification models to enhance their focus on spatially relevant or meaningful regions of the image, thus improving classification performance. Experimental results on three public medical image classification datasets demonstrate the effectiveness and data-efficiency of our approach.
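The abstract does not spell out how SLCA computes its weights, so the following is only a minimal sketch of one plausible reading: squeeze-and-excitation style channel attention computed separately for each spatial window. The function name `window_channel_attention`, the grid size, the two-layer bottleneck, and the assumption that the encoder and classifier feature maps share a shape are all illustrative choices, not details taken from the paper. Random arrays stand in for the SAM encoder output and the classifier features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def window_channel_attention(feats, grid, w1, w2):
    """Per-window channel attention weights (an assumed SE-style variant).

    feats: (C, H, W) feature map, standing in for frozen-encoder output.
    grid:  number of windows along each spatial axis (must divide H and W).
    w1:    (C, R) bottleneck weights; w2: (R, C) expansion weights.
    Returns attention weights of shape (C, H, W) with values in (0, 1).
    """
    c, h, w = feats.shape
    wh, ww = h // grid, w // grid
    # Average-pool each window: (C, grid, wh, grid, ww) -> (C, grid, grid).
    pooled = feats.reshape(c, grid, wh, grid, ww).mean(axis=(2, 4))
    # Put channels last so the small MLP acts on each window's descriptor.
    desc = pooled.transpose(1, 2, 0)              # (grid, grid, C)
    hidden = np.maximum(desc @ w1, 0.0)           # ReLU, (grid, grid, R)
    weights = sigmoid(hidden @ w2)                # (grid, grid, C)
    weights = weights.transpose(2, 0, 1)          # (C, grid, grid)
    # Copy each window's channel weights back to full resolution.
    return np.repeat(np.repeat(weights, wh, axis=1), ww, axis=2)

rng = np.random.default_rng(0)
C, H, W, R, GRID = 8, 16, 16, 4, 4
encoder_feats = rng.standard_normal((C, H, W))     # placeholder for SAM features
classifier_feats = rng.standard_normal((C, H, W))  # placeholder for classifier features
w1 = rng.standard_normal((C, R)) * 0.1             # hypothetical learned parameters
w2 = rng.standard_normal((R, C)) * 0.1

attn = window_channel_attention(encoder_feats, GRID, w1, w2)
reweighted = classifier_feats * attn               # recalibrated classifier features
print(attn.shape, reweighted.shape)
```

In this sketch every position inside a window shares the same channel weights, which is what makes the attention both channel-wise and spatially localized. A real pipeline would likely also need a projection or resize step when the two feature maps differ in channel count or resolution.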
Problem

Research questions and friction points this paper is trying to address.

Adapting SAM for medical image classification
Enhancing classification with segmentation-based features
Improving focus on relevant image regions via SLCA
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes SAM image encoder for feature extraction
Introduces Spatially Localized Channel Attention mechanism
Enhances classification focus on relevant image regions
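As a rough illustration of what using a frozen encoder for feature extraction means in practice, the sketch below trains only a small classification head on top of a fixed feature extractor. Everything here is a stand-in: the "encoder" is a fixed random projection rather than SAM, the data is synthetic, and the logistic-regression head and learning rate are arbitrary assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "encoder": a fixed random projection standing in for a pretrained model.
W_enc = rng.standard_normal((32, 16))
W_enc_before = W_enc.copy()

def encode(x):
    # No gradient is ever computed for W_enc, so it is never updated.
    return np.tanh(x @ W_enc)

# Synthetic data: 64 samples, 32 inputs, binary labels.
x = rng.standard_normal((64, 32))
y = (x[:, 0] > 0).astype(float)

# Trainable head: logistic regression on top of the frozen features.
w_head = np.zeros(16)
b_head = 0.0

def loss_and_grads(feats, labels, w, b):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    eps = 1e-12
    loss = -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))
    err = p - labels
    return loss, feats.T @ err / len(labels), err.mean()

# Because the encoder is frozen, features can be computed once and reused.
feats = encode(x)
initial_loss, _, _ = loss_and_grads(feats, y, w_head, b_head)
for _ in range(200):
    _, grad_w, grad_b = loss_and_grads(feats, y, w_head, b_head)
    w_head -= 0.1 * grad_w   # only the head's parameters change
    b_head -= 0.1 * grad_b
final_loss, _, _ = loss_and_grads(feats, y, w_head, b_head)
print(initial_loss, final_loss)
```

The point of the design is visible in the loop: the encoder contributes no gradients and no parameter updates, so its features can be cached and the training cost is limited to the head.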
Authors

Pengfei Gu
Assistant Professor in Computer Science, University of Texas Rio Grande Valley
Computer Vision, Deep Learning, Medical Image Analysis, Scientific Visualization
Haoteng Tang
Assistant Professor in Computer Science, University of Texas Rio Grande Valley
Machine learning, data mining, medical image computing and bioinformatics
Islam Akef Ebeid
Department of Computer Science, Texas Woman’s University, Denton, TX 76204, USA
J. A. Nunez
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Fabian Vazquez
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Diego Adame
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Marcus Zhan
Sewickley Academy, Sewickley, PA 15143, USA
Huimin Li
Ph.D. @ TU Delft/Postdoc @ TU Darmstadt
Hardware Security, RISC-V, SCA, ML, FPGA
Bin Fu
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Danny Z. Chen
Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA