A Multimodal Foundation Model to Enhance Generalizability and Data Efficiency for Pan-cancer Prognosis Prediction

📅 2025-09-15
🤖 AI Summary
Existing pan-cancer prognostic models struggle to integrate histopathology images, clinical reports, and genomic data effectively, which leads to poorly generalizable representations and inefficient use of available data. To address this, we propose MICE (Multimodal data Integration via Collaborative Experts), a framework that uses functionally diverse expert modules to jointly model cross-cancer commonalities and cancer-type-specific features. MICE couples contrastive learning with supervised learning to improve the generalizability of its multimodal representations. Trained on 11,799 patients across 30 cancer types from TCGA and other cohorts, MICE improves the C-index by 3.8% to 11.2% on internal cohorts and by 5.8% to 8.8% on independent external cohorts over unimodal and multi-expert multimodal baselines. It also remains data-efficient in limited-sample settings. The work offers a scalable multimodal foundation for pan-cancer prognosis prediction.

📝 Abstract
Multimodal data provides heterogeneous information for a holistic understanding of the tumor microenvironment. However, existing AI models often struggle to harness the rich information within multimodal data and, as a result, extract poorly generalizable representations. Here we present MICE (Multimodal data Integration via Collaborative Experts), a multimodal foundation model that effectively integrates pathology images, clinical reports, and genomics data for precise pan-cancer prognosis prediction. Instead of conventional multi-expert modules, MICE employs multiple functionally diverse experts to comprehensively capture both cross-cancer and cancer-specific insights. Leveraging data from 11,799 patients across 30 cancer types, we enhanced MICE's generalizability by coupling contrastive and supervised learning. MICE outperformed both unimodal and state-of-the-art multi-expert-based multimodal models, improving the C-index by 3.8% to 11.2% on internal cohorts and by 5.8% to 8.8% on independent cohorts. Moreover, it exhibited remarkable data efficiency across diverse clinical scenarios. With its enhanced generalizability and data efficiency, MICE establishes an effective and scalable foundation for pan-cancer prognosis prediction, holding strong potential to personalize therapies and improve treatment outcomes.
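The results above are reported as C-index values. As a reminder of what that metric measures, here is a minimal sketch of Harrell's concordance index in plain Python. It is not taken from the paper: the function name, the toy cohort, and the simplified handling of tied survival times (tied pairs are skipped) are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: the share of comparable patient pairs whose
    predicted risks are ordered consistently with observed survival."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time is an observed event (not a censored follow-up).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher predicted risk, earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: time in months, 1 = event observed, 0 = censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
print(round(c_index, 3))  # 4 of 5 comparable pairs are concordant -> 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.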
Problem

Research questions and friction points this paper is trying to address.

Integrating multimodal data for pan-cancer prognosis prediction
Improving model generalizability across diverse cancer types
Enhancing data efficiency in clinical prognosis scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal foundation model integrating pathology images, clinical reports, and genomics data
Multiple functionally diverse experts capturing both cross-cancer and cancer-specific insights
Contrastive and supervised learning enhancing generalizability and efficiency
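To make the idea of coupling contrastive and supervised learning concrete, the sketch below combines two commonly used objectives in plain Python: an InfoNCE-style contrastive loss between paired embeddings from two modalities, and the negative Cox partial log-likelihood as the supervised survival loss. The abstract does not state which losses MICE actually uses, so both choices, all function names, the `temperature` and `weight` parameters, and the toy inputs are assumptions for illustration only.

```python
import math


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def info_nce(anchors, positives, temperature=0.1):
    """Contrastive loss: each anchor should be most similar to the
    positive at the same index (e.g. image vs. report of one patient)."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    loss = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i to every candidate positive.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        loss += _log_sum_exp(logits) - logits[i]
    return loss / len(a)


def cox_nll(risks, times, events):
    """Negative Cox partial log-likelihood, without tie correction."""
    loss, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        at_risk = [risks[j] for j, t_j in enumerate(times) if t_j >= t_i]
        loss += _log_sum_exp(at_risk) - risks[i]
        n_events += 1
    return loss / max(n_events, 1)


def combined_loss(emb_a, emb_b, risks, times, events, weight=0.5):
    """Supervised survival loss plus a weighted contrastive term."""
    return cox_nll(risks, times, events) + weight * info_nce(emb_a, emb_b)


# Hypothetical toy batch: two patients, 2-D embeddings from two modalities.
image_emb = [[1.0, 0.0], [0.0, 1.0]]
report_emb = [[1.0, 0.0], [0.0, 1.0]]
total = combined_loss(image_emb, report_emb, risks=[1.2, -0.3],
                      times=[4, 9], events=[1, 1])
```

In this setup the contrastive term pulls together embeddings of the same patient across modalities, while the Cox term pushes predicted risk to follow observed survival order; `weight` trades off the two.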
Authors

Huajun Zhou
The Hong Kong University of Science and Technology
Computer Vision, Medical Image Processing
Fengtao Zhou
Hong Kong University of Science and Technology
Multimodal Learning, Computational Pathology
Jiabo Ma
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China.
Yingxue Xu
The Hong Kong University of Science and Technology
Multimodal Learning, Survival Analysis, Computational Pathology
Xi Wang
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China.
Xiuming Zhang
Department of Pathology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.
Li Liang
The University of Western Australia
3D Point Cloud Processing, 3D Semantic Scene Completion, 3D Semantic Scene Generation
Zhenhui Li
the Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Yunnan Cancer
radiomics, pathomics, colorectal cancer
Hao Chen
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China; Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; Division of Life Science, Hong Kong University of Science and Technology, Hong Kong, China; HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Futian, Shenzhen, China; State Key Laboratory of Nervous System Disorders, Hong Kong University of Science and Tech