Multimodal Machine Learning for Early Prediction of Metastasis in a Swedish Multi-Cancer Cohort

📅 2026-03-31
🤖 AI Summary
This study addresses the challenge of early metastasis risk prediction in cancer patients by proposing a multimodal deep learning framework that forecasts risk one month in advance to enable timely intervention. Leveraging electronic health records from Karolinska University Hospital in Sweden across four major cancers—breast, colorectal, prostate, and lung—the model integrates demographic data, comorbidities, laboratory results, medication history, and clinical text using an intermediate fusion strategy. Model interpretability is enhanced through SHAP analysis. The approach achieves F1 scores of 0.845, 0.786, and 0.845 for breast, colorectal, and prostate cancers, respectively, while for lung cancer, the text-only modality yields the best performance (F1 = 0.829). Overall, the method outperforms conventional models, demonstrating the efficacy of intermediate multimodal fusion and interpretable AI in oncological prognosis prediction.
📝 Abstract
Multimodal Machine Learning offers a holistic view of a patient's status, integrating structured and unstructured data from electronic health records (EHR). We propose a framework to predict metastasis risk one month prior to diagnosis, using six months of clinical history from EHR data. Data from four cancer cohorts collected at Karolinska University Hospital (Stockholm, Sweden) were analyzed: breast (n = 743), colon (n = 387), lung (n = 870), and prostate (n = 1890). The dataset included demographics, comorbidities, laboratory results, medications, and clinical text. We compared traditional and deep learning classifiers across single modalities and multimodal combinations, using various fusion strategies and a Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) 2a design, with an 80-20 development-validation split to ensure a rigorous, repeatable evaluation. Performance was evaluated using AUROC, AUPRC, F1 score, sensitivity, and specificity. We then employed a multimodal adaptation of SHAP to analyze the classifiers' reasoning. Intermediate fusion achieved the highest F1 scores on breast (0.845), colon (0.786), and prostate cancer (0.845), demonstrating strong predictive performance. For lung cancer, intermediate fusion achieved an F1 score of 0.819, while the text-only model performed best (F1 = 0.829). Deep learning classifiers consistently outperformed traditional models. Colon cancer, the smallest cohort, had the lowest performance, highlighting the importance of sufficient training data. SHAP analysis showed that the relative importance of modalities varied across cancer types. Fusion strategies offer distinct strengths and weaknesses. Intermediate fusion consistently delivered the best results, but strategy choices should align with data characteristics and organizational needs.
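The intermediate fusion strategy the abstract credits with the best results can be sketched as follows: each modality is passed through its own encoder, the learned embeddings are concatenated, and a joint classifier head produces the risk score. This is a minimal illustrative sketch with random placeholder weights and toy feature dimensions — not the paper's actual architecture or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality encoders: each projects raw features into a
# shared 8-dimensional embedding space (weights are random placeholders).
def encode(x, w):
    return np.maximum(x @ w, 0.0)  # linear projection + ReLU

# Toy feature vectors for a single patient (dimensions are illustrative).
labs = rng.normal(size=12)   # laboratory results
meds = rng.normal(size=20)   # medication history
text = rng.normal(size=32)   # clinical-text embedding (e.g. from a language model)

w_labs, w_meds, w_text = (rng.normal(size=(d, 8)) for d in (12, 20, 32))

# Intermediate fusion: concatenate the learned modality embeddings
# *before* the final classifier, rather than fusing raw inputs (early
# fusion) or per-modality predictions (late fusion).
fused = np.concatenate([encode(labs, w_labs),
                        encode(meds, w_meds),
                        encode(text, w_text)])  # shape (24,)

w_out = rng.normal(size=24)
risk = 1.0 / (1.0 + np.exp(-(fused @ w_out)))  # sigmoid classification head
print(fused.shape, risk)
```

Early fusion would instead concatenate `labs`, `meds`, and `text` directly, while late fusion would average three separate per-modality risk scores; the intermediate variant lets the model learn cross-modal interactions on comparable learned representations.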
Problem

Research questions and friction points this paper is trying to address.

Multimodal Machine Learning
Metastasis Prediction
Electronic Health Records
Early Diagnosis
Multi-Cancer Cohort
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Machine Learning
Intermediate Fusion
Early Metastasis Prediction
SHAP Interpretability
Electronic Health Records
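The SHAP interpretability listed above attributes a model's prediction to its inputs via Shapley values. For intuition, here is an exact Shapley-value computation for a tiny hypothetical 3-feature linear risk model (the model, patient values, and zero baseline are illustrative assumptions, not from the paper):

```python
from itertools import combinations
import math
import numpy as np

weights = np.array([0.5, -0.3, 0.8])   # hypothetical linear risk model
x = np.array([1.0, 2.0, 0.5])          # one patient's feature values
baseline = np.zeros(3)                 # "feature absent" reference values

def value(S):
    """Model output when only features in S take the patient's values."""
    z = baseline.copy()
    for i in S:
        z[i] = x[i]
    return float(weights @ z)

# Shapley value of feature i: its marginal contribution value(S + {i}) - value(S),
# averaged over all subsets S with the standard combinatorial weighting.
n = 3
phi = np.zeros(n)
for i in range(n):
    for r in range(n):
        for S in combinations([j for j in range(n) if j != i], r):
            w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            phi[i] += w * (value(S + (i,)) - value(S))

# For a linear model this reduces to w_i * (x_i - baseline_i).
print(phi)  # → [0.5, -0.6, 0.4]
```

The attributions sum to the gap between the patient's prediction and the baseline prediction, which is what lets the paper's multimodal SHAP adaptation compare the relative importance of whole modalities across cancer types. In practice the `shap` library approximates these values for non-trivial models rather than enumerating subsets.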
Franco Rugolon
Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
Korbinian Randl
PhD student at Stockholm University
Explainable Machine Learning in NLP
Braslav Jovanovic
Ioanna Miliou
Panagiotis Papapetrou