Decouple, Reorganize, and Fuse: A Multimodal Framework for Cancer Survival Prediction

📅 2025-08-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current cancer survival prediction models face two key limitations in multimodal fusion: (1) fixed fusion strategies (e.g., concatenation, attention) hinder feature disentanglement and dynamic cross-modal interaction; (2) Mixture-of-Experts (MoE)-based approaches isolate modality-specific experts, impeding inter-modal information exchange. To address these, we propose the Decoupling-Reorganization-Fusion (DeReF) framework. First, a modality decoupling module separates heterogeneous features into disentangled representations. Second, a random feature reorganization strategy breaks predefined fusion pathways, enhancing feature diversity and inter-expert information flow. Third, a regional cross-attention network within the decoupling module, coupled with a dynamic MoE fusion module, improves both disentanglement quality and fusion adaptability. Evaluated on an in-house Liver Cancer (LC) dataset and three widely used TCGA cohorts, DeReF improves survival prediction performance, demonstrating strong generalizability and clinical applicability.

📝 Abstract
Cancer survival analysis commonly integrates information across diverse medical modalities to make survival-time predictions. Existing methods primarily focus on extracting decoupled features from the modalities and performing fusion operations such as concatenation, attention, and Mixture-of-Experts (MoE)-based fusion. However, these methods still face two key challenges: i) fixed fusion schemes (concatenation and attention) can lead to model over-reliance on predefined feature combinations, limiting the dynamic fusion of decoupled features; ii) in MoE-based fusion methods, each expert network handles separate decoupled features, which limits information interaction among the decoupled features. To address these challenges, we propose a novel Decoupling-Reorganization-Fusion framework (DeReF), which devises a random feature reorganization strategy between the modality decoupling and dynamic MoE fusion modules. Its advantages are: i) it increases the diversity and granularity of feature combinations, enhancing the generalization ability of the subsequent expert networks; ii) it overcomes the problem of information closure and helps expert networks better capture information among the decoupled features. Additionally, we incorporate a regional cross-attention network within the modality decoupling module to improve the representation quality of decoupled features. Extensive experimental results on our in-house Liver Cancer (LC) dataset and three widely used TCGA public datasets confirm the effectiveness of our proposed method. The code will be made publicly available.
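The paper's code is not yet released, but the decouple-reorganize-fuse pipeline described in the abstract can be illustrated with a minimal NumPy sketch. Everything below is an illustrative assumption, not the paper's actual architecture: chunk sizes, the number of experts, the tanh experts, and the linear gating network are all made up to show the idea of pooling decoupled feature chunks, randomly regrouping them, and fusing expert outputs with a learned gate.

```python
import numpy as np

rng = np.random.default_rng(0)

def reorganize(decoupled, rng):
    """Randomly regroup decoupled feature chunks across modalities.

    `decoupled` maps each modality to a list of equally sized feature
    chunks (e.g., modality-common and modality-specific parts). Pooling
    and reshuffling the chunks breaks the fixed expert-to-feature
    pathways that fixed fusion schemes impose.
    """
    pool = [chunk for chunks in decoupled.values() for chunk in chunks]
    order = rng.permutation(len(pool))
    return [pool[i] for i in order]

def moe_fuse(groups, experts, gate_w):
    """Dynamic MoE fusion: a softmax gate weights the expert outputs."""
    x = np.concatenate(groups)                 # gate sees all reorganized features
    logits = gate_w @ x
    gate = np.exp(logits - logits.max())
    gate /= gate.sum()                         # softmax over experts
    outs = np.stack([f(g) for f, g in zip(experts, groups)])
    return gate @ outs                         # gate-weighted sum of expert outputs

# Toy example: two modalities, each decoupled into two 4-d chunks.
decoupled = {
    "image": [rng.normal(size=4), rng.normal(size=4)],
    "genomics": [rng.normal(size=4), rng.normal(size=4)],
}
groups = reorganize(decoupled, rng)            # 4 reshuffled chunks
experts = [lambda g, W=rng.normal(size=(2, 4)): np.tanh(W @ g) for _ in range(4)]
gate_w = rng.normal(size=(4, 16))              # 4 experts, 16-d fused input
fused = moe_fuse(groups, experts, gate_w)
print(fused.shape)  # (2,)
```

In a real model the reorganization would operate on learned tensors inside the network and the gate and experts would be trained end to end; this sketch only shows why reshuffling lets every expert see feature combinations it would never receive under a fixed pathway.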
Problem

Research questions and friction points this paper is trying to address.

Overcoming fixed fusion schemes limiting dynamic feature integration
Addressing information isolation in MoE-based multimodal fusion methods
Enhancing cancer survival prediction through improved feature interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random feature reorganization strategy
Dynamic MoE fusion modules
Regional cross-attention network
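The regional cross-attention contribution listed above can be approximated as ordinary scaled dot-product cross-attention restricted to matching regions of two modalities. The region count, tokens per region, feature width, and single-head form in this sketch are assumptions for illustration only:

```python
import numpy as np

def regional_cross_attention(q_regions, kv_regions):
    """Cross-attention applied independently within each region.

    q_regions, kv_regions: arrays of shape (R, N, D) with R regions,
    N tokens per region, and D feature dims. Queries from one modality
    attend only to keys/values of the matching region in the other,
    a minimal stand-in for region-restricted cross-attention.
    """
    R, N, D = q_regions.shape
    out = np.empty_like(q_regions)
    for r in range(R):
        q, kv = q_regions[r], kv_regions[r]
        scores = q @ kv.T / np.sqrt(D)            # (N, N) similarity within region r
        scores -= scores.max(axis=1, keepdims=True)
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)   # row-wise softmax
        out[r] = attn @ kv                        # attend only inside the region
    return out

rng = np.random.default_rng(1)
img = rng.normal(size=(3, 5, 8))   # e.g., 3 image regions, 5 tokens, 8-d
gen = rng.normal(size=(3, 5, 8))   # matching regions from another modality
fused = regional_cross_attention(img, gen)
print(fused.shape)  # (3, 5, 8)
```

Restricting attention to regions keeps each decoupled feature grounded in a local correspondence between modalities instead of letting every token attend globally, which is one plausible reading of why it improves the quality of the decoupled representations.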
👥 Authors

Huayi Wang
Shanghai Jiao Tong University

Haochao Ying
State Key Laboratory of Transvascular Implantation Devices of the Second Affiliated Hospital, Zhejiang University School of Medicine, and Transvascular Implantation Devices Research Institute, Hangzhou 310009, China; School of Public Health, Zhejiang University, Hangzhou 310058, China

Yuyang Xu
College of Computer Science and Technology, Zhejiang University, Hangzhou 310012, China; State Key Laboratory of Transvascular Implantation Devices of the Second Affiliated Hospital, Zhejiang University School of Medicine, and Transvascular Implantation Devices Research Institute, Hangzhou 310009, China; Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence, Hangzhou 310058, China

Qibo Qiu
Zhejiang University

Cheng Zhang
State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China

Danny Z. Chen
Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA

Ying Sun
State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China

Jian Wu
State Key Laboratory of Transvascular Implantation Devices of the Second Affiliated Hospital, Zhejiang University School of Medicine, and Transvascular Implantation Devices Research Institute, Hangzhou 310009, China; School of Public Health, Zhejiang University, Hangzhou 310058, China; Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence, Hangzhou 310058, China