🤖 AI Summary
Medical image classification often suffers from weak discriminability of critical anatomical regions, owing to domain-specific variations and limited labeled data. Method: This work investigates the transferability of segmentation foundation models—specifically the Segment Anything Model (SAM)—to medical image classification. To address anatomical ambiguity without fine-tuning SAM's large-scale parameters, we freeze its image encoder as a generic feature extractor and introduce a Spatially Localized Channel Attention (SLCA) mechanism that adaptively recalibrates channel-wise feature weights in a spatially aware manner. Contribution/Results: To our knowledge, this is the first successful adaptation of SAM to medical image classification without fine-tuning its parameters, striking a favorable balance between generalizability and computational efficiency. Extensive experiments on three public medical image classification benchmarks show significant accuracy improvements and strong data efficiency in few-shot settings, validating that segmentation priors effectively enhance classification performance.
📝 Abstract
Recent advancements in foundation models, such as the Segment Anything Model (SAM), have shown strong performance across a variety of vision tasks, most notably zero-shot image segmentation. However, effectively adapting such models to medical image classification remains underexplored. In this paper, we introduce a new framework that adapts SAM for medical image classification. First, we use SAM's image encoder as a feature extractor, with its weights frozen to avoid training overhead, to capture segmentation-based features that convey important spatial and contextual details of the image. Next, we propose a novel Spatially Localized Channel Attention (SLCA) mechanism that computes spatially localized attention weights from these feature maps. The attention weights are integrated into deep learning classification models to strengthen their focus on spatially meaningful regions of the image, thereby improving classification performance. Experimental results on three public medical image classification datasets demonstrate the effectiveness and data efficiency of our approach.
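The abstract does not give SLCA's exact formulation, but the described behavior — per-location channel attention weights computed from frozen encoder features and multiplied back onto them — can be sketched as follows. This is a minimal NumPy illustration, assuming a single-image feature map of shape `(C, H, W)` and a hypothetical learnable 1×1 channel projection `proj`; the paper's actual parameterization may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def slca(features, proj):
    """Spatially Localized Channel Attention (illustrative sketch).

    features: (C, H, W) feature map from the frozen SAM image encoder
    proj:     (C, C) hypothetical learnable 1x1 projection over channels
    Returns a recalibrated feature map of the same shape.
    """
    C, H, W = features.shape
    flat = features.reshape(C, H * W)   # flatten spatial grid: (C, HW)
    logits = proj @ flat                # 1x1 conv = per-location channel mixing
    weights = sigmoid(logits)           # attention weights in (0, 1), one per
                                        # channel at *each* spatial location
    return (flat * weights).reshape(C, H, W)

# Toy usage: 8 channels on a 4x4 spatial grid
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
attn_out = slca(feat, 0.1 * rng.standard_normal((8, 8)))
```

Because the sigmoid keeps every weight in (0, 1), the mechanism can only attenuate channels location by location, which matches the described role of steering the downstream classifier toward spatially relevant regions rather than rescaling features arbitrarily.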