Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST

📅 2025-01-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically evaluates the transferability of mainstream foundation models, including convolutional networks (ResNet, ConvNeXt) and Transformers (ViT), for medical image classification on the unified MedMNIST benchmark. Method: two transfer learning paradigms, end-to-end fine-tuning and linear probing, are adopted to quantitatively analyze the impact of architecture choice, input resolution, and training data scale on classification accuracy. Contribution/Results: To our knowledge, this is the first side-by-side comparative evaluation of multiple foundation architectures on a standardized medical imaging benchmark. The analysis reveals a non-trivial trade-off between model architecture and data regime: linear probing is more robust in low-data regimes, whereas end-to-end fine-tuning significantly outperforms it at scale. Experiments confirm strong cross-domain transferability of pre-trained models to medical tasks, with several configurations achieving state-of-the-art accuracy on MedMNIST. Based on these findings, the authors propose a practical, clinically informed model selection guideline for real-world deployment.
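The summary contrasts the two transfer paradigms benchmarked in the paper: linear probing, where the pre-trained backbone is frozen and only a classification head is trained, versus end-to-end fine-tuning, where all parameters are updated. A minimal sketch of the linear-probing side, using a toy frozen feature map as a hypothetical stand-in for a foundation-model backbone (not the authors' actual setup):

```python
# Linear probing sketch: the "backbone" below is a fixed toy feature map
# standing in for a frozen pre-trained encoder; only the linear head
# (w, b) is trained, via logistic-regression SGD on a toy binary task.
import math
import random

random.seed(0)

def backbone(x):
    """Frozen feature extractor: its parameters are never updated."""
    return [x[0] + x[1], x[0] - x[1]]

def train_linear_probe(data, lr=0.1, epochs=200):
    """Train only a linear head on top of frozen backbone features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = backbone(x)                      # features stay fixed
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))       # sigmoid
            g = p - y                            # logistic-loss gradient
            w[0] -= lr * g * f[0]
            w[1] -= lr * g * f[1]
            b -= lr * g
    return w, b

def predict(w, b, x):
    f = backbone(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Toy binary task: label is 1 when x0 + x1 > 0 (linearly recoverable
# from the frozen features, so the probe should fit it well).
data = []
for _ in range(40):
    x = (random.uniform(-1, 1), random.uniform(-1, 1))
    data.append((x, 1 if x[0] + x[1] > 0 else 0))

w, b = train_linear_probe(data)
acc = sum(predict(w, b, x) == y for x, y in data) / len(data)
```

End-to-end fine-tuning would additionally update the backbone's parameters, which, per the paper's findings, pays off once enough training data is available.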

📝 Abstract
Foundation models are widely employed in medical image analysis due to their high adaptability and generalizability for downstream tasks. With the increasing number of foundation models being released, model selection has become an important issue. In this work, we study the capabilities of foundation models in medical image classification tasks by conducting a benchmark study on the MedMNIST dataset. Specifically, we adopt various foundation models ranging from convolutional to Transformer-based models and implement both end-to-end training and linear probing for all classification tasks. The results demonstrate the significant potential of these pre-trained models when transferred for medical image classification. We further conduct experiments with different image sizes and various sizes of training data. By analyzing all the results, we provide preliminary yet useful insights and conclusions on this topic.
Problem

Research questions and friction points this paper is trying to address.

Medical Image Recognition
Model Selection
Accuracy Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Medical Image Classification
Pre-trained Models
Performance Evaluation
Fuping Wu
University of Oxford
Medical Image Analysis · Semi-supervised Learning · Unsupervised Learning
Bartlomiej W. Papiez
Nuffield Department of Population Health, University of Oxford, UK; Big Data Institute, University of Oxford, UK