MEDFORM: A Foundation Model for Contrastive Learning of CT Imaging and Clinical Numeric Data in Multi-Cancer Analysis

📅 2025-01-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multi-cancer CT classification (lung, breast, and colorectal) faces challenges including complex multi-slice structural dependencies, high expert annotation costs, and the scarcity of large-scale multimodal pretraining data. Method: This paper proposes MEDFORM, a medical foundation model for multi-cancer analysis that employs a two-stage pretraining paradigm: (1) SimCLR-based self-supervised learning to extract discriminative single-slice image features; and (2) cross-modal contrastive learning to align CT image representations with clinical numeric features, with multiple instance learning (MIL) used to aggregate slice features into whole-scan representations. Contribution/Results: Pretrained on nearly 160,000 CT slices, the model achieves significant improvements in cancer classification accuracy and demonstrates strong generalization under few-shot settings. It establishes a low-annotation-dependence paradigm for multimodal medical imaging analysis, advancing scalable and clinically adaptable AI solutions.
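The second pretraining stage described above is a cross-modal contrastive objective. A minimal sketch of the symmetric InfoNCE loss commonly used for this kind of image/clinical alignment is below; the function names, embedding sizes, and temperature value are illustrative assumptions, not details from the paper.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere before computing similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def cross_modal_infonce(img_emb, clin_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning paired image / clinical embeddings.

    img_emb, clin_emb: (N, D) arrays; row i of each comes from the same patient,
    so the N diagonal pairs are positives and all off-diagonal pairs negatives.
    """
    img = l2_normalize(img_emb)
    clin = l2_normalize(clin_emb)
    logits = img @ clin.T / temperature          # (N, N) cosine-similarity matrix
    idx = np.arange(len(logits))                 # positives sit on the diagonal

    def xent(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()            # cross-entropy toward diagonal

    # Average the image->clinical and clinical->image directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 32))
# Well-aligned pairs (clinical embedding ~= image embedding) give a small loss.
loss = cross_modal_infonce(img, img + 0.01 * rng.normal(size=(8, 32)))
```

Under this objective, gradients pull each patient's CT and clinical embeddings together while pushing apart mismatched patient pairs in the batch.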

📝 Abstract
Computed tomography (CT) and clinical numeric data are essential modalities for cancer evaluation, but building large-scale multimodal training datasets for developing medical foundation models remains challenging due to the structural complexity of multi-slice CT data and the high cost of expert annotation. In this study, we propose MEDFORM, a multimodal pre-training strategy that guides CT image representation learning using complementary information from clinical data for medical foundation model development. MEDFORM efficiently processes CT slices through multiple instance learning (MIL) and adopts a dual pre-training strategy: first pretraining the CT slice feature extractor using SimCLR-based self-supervised learning, then aligning the CT and clinical modalities through cross-modal contrastive learning. Our model was pre-trained on three different cancer types: lung cancer (141,171 slices), breast cancer (8,100 slices), and colorectal cancer (10,393 slices). The experimental results demonstrate that this dual pre-training strategy improves cancer classification performance and maintains robust performance in few-shot learning scenarios. Code is available at https://github.com/DigitalHealthcareLab/25MultiModalFoundationModel.git
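The abstract's MIL step aggregates per-slice features into one scan-level representation. One common way to do this is attention-based pooling; the sketch below follows that style, and the parameters `w` and `v`, the feature sizes, and the slice count are hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(x):
    # Stable softmax over a 1-D score vector.
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attention_mil_pool(slice_feats, w, v):
    """Attention-based MIL pooling over the slices of one CT scan.

    slice_feats: (S, D) per-slice embeddings from the slice feature extractor.
    w: (H, D), v: (H,) -- hypothetical learned attention parameters.
    Returns a single (D,) scan-level embedding.
    """
    scores = np.tanh(slice_feats @ w.T) @ v      # (S,) unnormalized attention
    alpha = softmax(scores)                      # per-slice importance weights
    return alpha @ slice_feats                   # attention-weighted average

rng = np.random.default_rng(1)
feats = rng.normal(size=(40, 16))   # e.g. a scan of 40 slices, 16-dim features
w = rng.normal(size=(8, 16))
v = rng.normal(size=(8,))
scan_emb = attention_mil_pool(feats, w, v)
```

Because the pooled embedding is a convex combination of slice embeddings, the scan-level representation stays in the same feature space as the slice features, which is what allows a single slice encoder to serve whole variable-length scans.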
Problem

Research questions and friction points this paper is trying to address.

Cancer Differentiation
Medical Imaging
Machine Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

MEDFORM
self-supervised feature extraction
limited annotated data
Daeun Jung
Department of Biomedical Systems Informatics, Yonsei University College of Medicine
Jaehyeok Jang
Department of Biomedical Systems Informatics, Yonsei University College of Medicine
Sooyoung Jang
Department of Biomedical Systems Informatics, Yonsei University College of Medicine
Yu Rang Park
Department of Biomedical Systems Informatics, Yonsei University College of Medicine