AdaMHF: Adaptive Multimodal Hierarchical Fusion for Survival Prediction

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address modality heterogeneity, sparsity, and missingness in multimodal survival prediction from histopathological images and genomic data, this paper proposes an adaptive hierarchical multimodal fusion framework. Methodologically: (1) we introduce an expert-expansion mechanism with a residual structure that activates specialized experts to extract heterogeneous, sparse features; (2) we propose a token selection and weighted aggregation strategy for feature refinement; (3) we design a multi-granularity hierarchical cross-modal encoder; and (4) we construct the first survival prediction benchmark supporting arbitrary modality missingness. Evaluated on the TCGA pan-cancer dataset, our method achieves a C-index 1.8% higher than state-of-the-art methods under full-modality settings, and maintains robust performance (C-index > 0.72) even when a single modality is missing. These results demonstrate substantial improvements in model robustness and clinical deployability.
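The multi-granularity cross-modal encoder in (3) can be sketched as coarse-then-fine cross-attention between pathology and genomic tokens. The two-level hierarchy, token counts, feature dimension, and projection-free single-head attention below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def cross_attend(q, kv, scale):
    """Single-head cross-attention without learned projections (illustrative)."""
    att = q @ kv.T * scale
    att = np.exp(att - att.max(axis=-1, keepdims=True))
    att = att / att.sum(axis=-1, keepdims=True)
    return att @ kv

rng = np.random.default_rng(2)
path = rng.standard_normal((32, 8))   # hypothetical pathology tokens
gene = rng.standard_normal((16, 8))   # hypothetical genomic tokens
scale = 1 / np.sqrt(8)

# Coarse level: a pooled pathology summary attends over genomic tokens;
# fine level: every pathology token attends individually.
coarse = cross_attend(path.mean(axis=0, keepdims=True), gene, scale)  # (1, 8)
fine = cross_attend(path, gene, scale)                                # (32, 8)
fused = np.concatenate([coarse, fine], axis=0)                        # (33, 8)
print(fused.shape)
```

Stacking interactions at both granularities is one simple way to let a fusion head see global modality agreement alongside token-level correspondences.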

📝 Abstract
The integration of pathologic images and genomic data for survival analysis has gained increasing attention with advances in multimodal learning. However, current methods often ignore biological characteristics, such as heterogeneity and sparsity, both within and across modalities, ultimately limiting their adaptability to clinical practice. To address these challenges, we propose AdaMHF: Adaptive Multimodal Hierarchical Fusion, a framework designed for efficient, comprehensive, and tailored feature extraction and fusion. AdaMHF is specifically adapted to the uniqueness of medical data, enabling accurate predictions with minimal resource consumption, even under challenging scenarios with missing modalities. First, AdaMHF employs an expert expansion and residual structure to activate specialized experts for extracting heterogeneous and sparse features. Extracted tokens undergo refinement via selection and aggregation, reducing the weight of non-dominant features while preserving comprehensive information. Subsequently, the encoded features are hierarchically fused, allowing multi-grained interactions across modalities to be captured. Furthermore, we introduce a survival prediction benchmark designed to resolve scenarios with missing modalities, mirroring real-world clinical conditions. Extensive experiments on TCGA datasets demonstrate that AdaMHF surpasses current state-of-the-art (SOTA) methods, showcasing exceptional performance in both complete and incomplete modality settings.
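The token refinement step described above (down-weighting non-dominant features while keeping comprehensive information) can be sketched as keeping the top-k tokens by an importance score and collapsing the remainder into one score-weighted summary token. The scoring mechanism, k, and dimensions here are hypothetical placeholders, not the paper's design.

```python
import numpy as np

def refine_tokens(tokens, scores, k):
    """Keep the k highest-scoring tokens; aggregate the rest into a
    single softmax-weighted summary token (hypothetical sketch)."""
    order = np.argsort(scores)[::-1]          # descending by importance
    keep, rest = order[:k], order[k:]
    if rest.size == 0:
        return tokens[keep]
    w = np.exp(scores[rest] - scores[rest].max())
    w = w / w.sum()
    summary = (w[:, None] * tokens[rest]).sum(axis=0, keepdims=True)
    return np.concatenate([tokens[keep], summary], axis=0)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))   # 16 tokens, feature dim 8
scores = rng.standard_normal(16)        # e.g. attention-derived importance
refined = refine_tokens(tokens, scores, k=4)
print(refined.shape)  # (5, 8): 4 dominant tokens + 1 aggregated token
```

The aggregated summary token is what distinguishes this from plain top-k pruning: low-scoring tokens still contribute, just with reduced weight.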
Problem

Research questions and friction points this paper is trying to address.

Integrates pathologic images and genomic data for survival prediction
Addresses heterogeneity and sparsity within and across medical modalities
Handles missing modalities to mimic real-world clinical scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive multimodal hierarchical fusion framework
Expert expansion for heterogeneous feature extraction
Hierarchical fusion for multi-grained modality interaction
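The expert-expansion idea above can be illustrated as a small mixture-of-experts block with a residual connection: a softmax gate routes the input across specialized experts, and the residual path preserves the original signal. The linear experts, gate, and dimensions below are purely illustrative assumptions.

```python
import numpy as np

def moe_residual(x, experts, gate_w):
    """Hypothetical expert-expansion block: softmax gating over experts
    plus a residual connection around the mixture."""
    logits = x @ gate_w                      # (num_experts,)
    g = np.exp(logits - logits.max())
    g = g / g.sum()                          # gate weights sum to 1
    out = sum(gi * ei(x) for gi, ei in zip(g, experts))
    return x + out                           # residual structure

rng = np.random.default_rng(1)
d, n_exp = 8, 3
# Each "expert" is a simple linear map here, standing in for a
# specialized sub-network (e.g. one per modality characteristic).
weights = [rng.standard_normal((d, d)) * 0.1 for _ in range(n_exp)]
experts = [lambda x, W=W: x @ W for W in weights]
gate_w = rng.standard_normal((d, n_exp))
x = rng.standard_normal(d)
y = moe_residual(x, experts, gate_w)
print(y.shape)  # (8,)
```

The gate can learn to activate different experts for heterogeneous versus sparse inputs, while the residual keeps the block stable when no expert fires strongly.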
Shuaiyu Zhang
School of Computing and Information Technology, Great Bay University; Harbin Institute of Technology
Xun Lin
Postdoc, CUHK; PhD, Beihang University
Subtle Visual Computing; Media Security
Rongxiang Zhang
Harbin Institute of Technology
Artificial Intelligence; SVG Generation
Yu Bai
School of Computing and Information Technology, Great Bay University
Yong Xu
Dongguan Key Laboratory for Intelligence and Information Technology
Tao Tan
FCA MPU
Medical Imaging AI
Xunbin Zheng
School of Computing and Information Technology, Great Bay University
Zitong Yu
U.S. Food and Drug Administration
Medical Imaging; Deep Learning; Machine Learning; Image Reconstruction